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Abstract 



As a computationally less costly test case for full QCD, we investigate an 
SU(3) Yang-Mills theory coupled to a bosonic spinor field. This theory cor- 
responds to QCD with minus two quark fiavors and is known as the bermion 
model. Our central object of interest is the step scaling function which de- 
scribes the scale evolution of the running coupling in the Schrodinger func- 
tional scheme. With the help of a non-perturbative recursive finite size tech- 
nique, it can be used to determine the A parameter, which characterizes the 
coupling at high energy, from experimental input at low energies. 

We study in detail the lattice artefacts and the continuum extrapolation 
of the step scaling function from lattice simulations when 0(a) improvement 
according to the Symanzik programme is used. Our results are compared to 
the unimproved bermion and dynamical fermion cases, and to renormalized 
perturbation theory in the continuum limit. 

For the bermion model, we also examine the step scaling function with 
massive quarks. According to the Appelquist-Carazzone theorem the con- 
tributions from matter fields arc expected to vanish for large masses, such 
that the step scaling function converges to the pure gauge theory case. If one 
wants to connect non-perturbatively different effective theories with different 
numbers of active quarks over fiavor thresholds, lattice artefacts should be 
reasonably small. In order to test the feasibility of such a method, we inves- 
tigate the step scaling function and its lattice artefacts for several values of 
the mass. 

For the Monte Carlo simulation of improved bermions, we develop a suit- 
able algorithm and compare its performance with unimproved bermions and 
full QCD. As a preparative study, we compare the efficiency of algorithms in 
pure gauge theory. 
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Zusammenfassung 



Wir untersuchen eine SU(3) Yang-Mills-Theorie mit einer Kopplung an ein 
bosonisches Spinorfeld. Diese als Bermion-Modell bekannte Theorie entspricht 
formal QCD mit minus zwei Quark-Flavors. Gegenuber der voUen QCD 
erfordert sie wesentlich weniger Computerzeit und ist deshalb als relativ 
kostcngunstigcs Testmodcll gccignct. Im Mittelpunkt unscrcs Intcrcsses ste- 
ht die Step-Scaling-Funktion, die die Skalenabhangigkeit der laufenden Kop- 
plung im Schrodinger-Funktional-Renormierungsschema beschreibt. Mit Hil- 
fe einer nicht-perturbativen Finite-Size- Technik kann sie benutzt werden, um 
den A-Parameter, der die Kopplung bei hohen Energien charakterisiert, aus 
experimcntcllcn Daten bei niedrigen Energien zu bcstimmcn. 

Wir studieren im Detail die Gitterartefakte und die Kontinuumsextrap- 
olation der aus Gittersimulationen bestimmten Step- Scaling- Funktion, wenn 
0(a)-Verbesserung nach Symanzik verwendet wird. Unsere Resultate stellen 
wir dem Fall von unverbesserten Bermionen und dynamischen Fermionen 
gegeniiber, und vergleichen im Kontinuumslimes mit renormierter Storungs- 
theorie. 

Wcitcrhin bctrachten wir im Bcrmion-Modcll die Stcp-Scaling-Funktion 
mit massiven Quarks. Nach dem Appelquist-Carazzone-Theorem erwartet 
man, daB Beitrage von Materiefeldern mit ansteigender Masse verschwinden, 
so dafi die Step-Scaling-Funktion gcgcn den Fall reiner Eichthcoric konvergieren 
sollte. Wenn man nicht-perturbativ verschiedene effcktivc Thcoricn mit ver- 
schiedener Anzahl von aktiven Quarks iiber Massenschwcllcn hinweg verbinden 
will, soUten Gitterartefakte klein sein. Um die Durchfiihrbarkeit einer solchen 
Methode zu testen, untersuchen wir die Step-Scaling-Funktion und ihre Git- 
terartefakte flir verschiedene Massen. 

Fiir die Monte-Carlo-Simulation von verbesserten Bermionen entwick- 
eln wir cinen geeigneten Algorithmus und vergleichen seine Effizienz mit 
unbesserten Bermionen und mit voUer QCD. Als vorbereitende Studie ver- 
gleichen wir die Effizienz verschiedener Algorithmen in reiner Eichtheorie. 
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Chapter 1 
Introduction 



In the standard model of particle physics, Quantum Chromodynamics (QCD) 
is the theory of the strong interaction. It covers the interaction between six 
flavors of quarks which are the constituents of hadronic matter. QCD is a 
gauge theory based on the non-abelian SU(3) gauge group. The fermionic 
matter fields, the quarks, carry a quantum number called "color" and trans- 
form according to the fundamental representation of the group. The gauge 
fields transform according to the adjoint representation and describe an octet 
of gluons. 

Classically, the structure of the strong interaction is relatively simple 
compared to the electroweak theory. The gauge group is unbroken and the 
quark states that participate in the strong interaction coincide with the mass 
eigenstates. The only input parameters are the strong coupling constant and 
the quark masses. As a quantum field theory, QCD nevertheless exposes a 
lot of interesting phenomena, part of which are insuffiently understood even 
after thirty years of research. 

The quantization of continuum field theories leads to divergences that 
must be removed by regularizing the theory. One can obtain finite results for 
physical observables by a renormalization of the parameters of the theory, 
which become functions of a renormalization scale. The running coupling ctg 
of QCD plays an essential role in the characterization of the theory. 

An important property of non-abelian gauge theories is asymptotic free- 
dom. At high energies, the coupling vanishes asymptotically and quarks 
behave like free particles. Before the discovery of QCD, the phenomeno- 
logical parton model used for deep inelastic scattering processes reflected 
this behavior. The application of perturbation theory to QCD successfully 
describes the corrections to the Bjorken scaling law that follows from the 
assumption of free particles. 

On the other hand, the behavior of the running coupling is such that the 
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coupling grows at low energies. As a consequence, the traditional perturba- 
tive methods in Quantum Field Theory, which were developed in the frame- 
work of Quantum Electrodynamics (QED), break down in this regime. A new 
feature arising at large distances is the phenomenon of confinement, which 
expresses the observation that free quarks do not exist in nature. Hadronic 
matter appears only in the form of color singlets, which are grouped into 
mesons (quark-antiquark hadrons) and baryons (hadrons consisting of three 
quarks) . 

A decisive step for the understanding of low energy QCD was done by 
Wilson in 1974, who formulated this theory as a lattice regularized euclidean 
Quantum Field Theory. In this regularization scheme, matter fields are de- 
fined on the sites of a hypercubic space-time lattice and gauge fields are pa- 
rametrized as parallel transporters between the sites. The continuum limit 
is obtained by decreasing the lattice spacing to zero. This approach opened 
the possibility of applying methods from the toolbox of statistical physics, 
like the strong coupling (high temperature) expansion. Today, Monte Carlo 
simulations on the lattice have become one of the most popular methods for 
the non-perturbative investigation of QCD. 

The scale dependences of the running coupling and the running quark 
masses in QCD are described by the renormalization group equations. As 
these are differential equations of first order, in QCD with A^f quark flavors 
there are A^f + 1 integration constants which have to be determined from 
experiment. One way to express the free parameter of the running coupling is 
the A parameter which is a measure for the asymptotic decay of the coupling 
at high energies. Another way - used for example in the particle data book 

- is to compute the coupling at a reference scale, conventionally taken as 
the Z° mass. 

Figure shows an overview of some typical experimental measurements 
of the QCD coupling ag in the MS renormalization scheme at different scales 
0. In order to compare results obtained at different scales, one evolves the 
coupling to the Z° mass by employing the renormalization group equation in 
the MS scheme, computed in perturbation theory up to 4-loop. In the above 
reference, a world average of 

a^iMzo) = 0.1184 ±0.0031 

is obtained. We have plotted the running MS coupling for this average and 
its error as continuous respectively dashed linesQ. 

It turns out that the experimental results for as{Mzo) obtained from 
measurements at different scales agree well with each other. But it is also 

^The running MS coupling has been computed using the RunDec program from 
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Figure 1.1: Experimental values for ag, compared with the world average. 



clear that at some scale, ag is not a small expansion parameter anymore and 
perturbation theory must break down. At lower energies and larger distances, 
confinement occurs and non-perturbative contributions like instantons are 
known to play an important role. But not only does perturbation theory 
fail in qualitatively describing these phenomena. Even at scales where it 
apparently converges, it is impossible to specify the systematic error caused 
by using perturbation theory up to a low finite order for the evolution of 
the coupling to the reference scale. A renormalization scheme that is only 
perturbatively defined is not suitable at low energies. 

Still, one can investigate the low energy properties of QCD with the 
help of computer simulations. In this context, QCD is regularized by a 
hypercubic lattice. After the free parameters in the theory have been fixed 
by requiring certain quantities - like hadron masses - to take their physical 
value, further observables can be measured in Monte Carlo simulations and 
compared to experimental results. While in principle, this method provides 



3 



a non-perturbatively defined scheme which is suitable in a certain low energy 
range, the size of lattices which can be simulated in practice puts a tight 
constraint on the energy scale which can be reached. 

So while different renormalization approaches can be used to study differ- 
ent ranges of scales, it is not a priori clear how the parameters in the different 
schemes are connected to each other. For a test of QCD at all scales, it is 
indispensable to use a scheme which is defined non-perturbatively, practi- 
cally tractable and which does not use any uncontrolled approximations. A 
method for this was proposed by Liischer, Weisz and Wolff 0]. 

The fundamental concept of this approach is that one does not need to 
accommodate all relevant scales on a single lattice. Instead, one uses a finite 
volume scheme where the coupling runs with the space-time volume. The 
coupling can be tracked recursively along increasing scales using the step 
scaling function, which describes the change of the coupling under discrete 
changes of the scale. The step scaling function at a single point can be 
determined by simulating lattice pairs with decreasing lattice spacing and 
extrapolating to the continuum. No large scale ratios occur within each lat- 
tice pair, and the extrapolation to the continuum can be done with relatively 
moderate lattice sizes. 

This strategy can be used for any asymptotically free theory. First tests 
were made with the nonlinear 0(3) model in two dimensions [Q. Later it 
has been generalized to pure SU(2) gauge theory and pure SU(3) gauge 
theory within the framework of the ALPHA collaboration. While in 
principle, there is considerable freedom in the precise choice of boundary 
conditions and definition of the coupling, for non-abelian gauge theories a 
special choice has solidified where the theory is defined on a cylinder with 
Dirichlet boundary conditions in the time direction. The fields at the lower 
and upper boundary induce a classical background field on the cylinder. A 
coupling can then be defined as the response of the system towards changes 
of the background field. The partition function can be quantum mechanically 
interpreted as the propagation kernel for going from the initial configuration 
at the lower boundary to the final configuration at the upper boundary. 
For this reason, this renormalization scheme is also dubbed the Schrddinger 
functional scheme. 

In the context of QCD, the ALPHA programme has been fully imple- 
mented in the SU(3) pure gauge theory, which can also be understood as 
the quenched approximation to QCD. This includes several aspects. At low 
energy, a reference scale is set that allows to express quantities like the A 
parameter in physical units. The Sommer scale ro [0, ^ is derived from the 
effective potential between static quarks. It can be determined experimen- 
tally by measurements of charmonium and bottomonium bound states. At 
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intermediate scales, pairs of lattices with decreasing lattice spacing were sim- 
ulated and the step scaling function was then extrapolated to the continuum. 
Finally, when a scale is reached where perturbative behavior can be shown 
to set in, one applies perturbation theory to compute the A parameter in the 
Schrodinger functional scheme. A 2-loop calculation is required to confirm 
that no uncontrolled error is introduced in this step 0. 

In the quenched approximation, the fermion determinant in the partition 
function is formally set to a constant. Physically, this corresponds to freezing 
the dynamics of quarks and neglecting their vacuum polarization effects. 
The sole motivation for this approximation is the radically reduced cost of 
computer simulations. While this model has turned out to yield surprisingly 
good results in many areas like hadron spectroscopy and matrix elements, 
the quenched approximation is of course not a substitute for full QCD. In 
particular, certain phenomena like string breaking or the rj — r]' mass splitting 
are absent. Quantitatively, systematic deviations from experiment up to 10% 
are seen in the hadron spectrum |]10| . 

Meanwhile, the ground has been layed out for the extension of the AL- 
PHA programme to a pair of massless dynamical fermions. First results 
for the running coupling have been published in []11|. The main practical 
problem is the high cost of numerical simulations with dynamical fermions. 
Therefore, the range of simulated lattices does not yet reach very close to 
the continuum limit, and simulations on larger lattices are desirable. Also, 
an element still missing in the programme is the determination of a physical 
scale as a substitute for r^. 

As the efficiency of Monte Carlo algorithms for dynamical fermions de- 
creases with a high inverse power of the lattice spacing, it is not possible 
to perform simulations arbitrarily near to the continuum limit. In order to 
extrapolate results to vanishing lattice spacing, one wants to accelerate the 
convergence to the continuum as much as possible. Different solutions for 
this problem have been invented. The aim of the perfect action approach is 
to apply theories that yield continuum results already at finite lattice spacing 
JT^ . The theoretical background for this method is Wilson's renormalization 
group. In a computer implementation, approximations have to be made to 
parametrize the action. Another problem is that the fixed point actions used 
today are not "quantum perfect", i.e. they yield continuum results only at 
vanishing value of the gauge coupling. 

A complementary approach is the Symanzik programme. The idea here 
is to cancel lattice artefacts order by order in the lattice spacing a. This 
is done by adding certain - in the language of the renormalization group 
irrelevant - terms to the action and to operators and adjusting their coef- 
ficients appropriately. In practice, the full non-perturbative determination 
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of improvement coefficients is a big task. Therefore currently only 0(a) im- 
provement is feasible, i.e. observables converge to the continuum limit with 
O(a^) artefacts. 

Symanzik's improvement programme is based on statements that are valid 
for asymptotically small lattice spacings and are derived from perturbation 
theory. In particular, one does not know when higher order terms in a become 
negligible and the asymptotic behavior sets in [|l^ . 



In this thesis, we are going to study the approach to the continuum limit 
in the Schrodinger functional and test whether perturbative expectations 
about lattice artefacts hold. The aim of this investigation is to put the study 
of the running coupling for dynamical fermions on a firmer ground and justify 
the used continuum extrapolation and its error estimate. Since full QCD is 
so notoriously costly to simulate, we study the step scaling function in a 
different theory which also goes beyond the quenched approximation. By 
setting the number of quark flavors to A^f = —2, we obtain a Yang-Mills 
theory coupled to a bosonic spinor field. This theory, also known as the 
bermion model, has a local interaction in terms of Bose fields and is therefore 
much cheaper to simulate. 

Another topic which we treat with the bermion model is the massive step 
scaling function. As soon as one wants to study QCD with quarks that have 
masses, one has to cope with additional lattice artefacts which are especially 
sizable when the mass becomes comparable to the inverse lattice spacing. 
An attractive possibility for saving the evaluation of the running coupling 
from this danger is to make use of the decoupling of heavy quarks. Since 
according to the Appelquist-Carazzone theorem, quarks with masses very 
large compared to some scale do not contribute to the physics at this scale, 
one may drop these quarks from the theory in a certain energy regime. De- 
pending on the mass cutoff where one matches the theories with and without 
a heavy quark, one introduces a systematic error, which has previously been 



estimated in perturbation theory Here we study the dependence of the 
step scaling function in the bermion model on the mass, which should be an 
indication whether the decoupling follows perturbative expectations. 
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Chapter 2 
Theory 



In this chapter, we are going to introduce the basic concepts used in this 
thesis, and define the model used in later chapters. It goes without saying 
that we cannot discuss in depth all issues involved. For an introduction 
into gauge theories and perturbative renormalization we refer to [|15]. The 
framework and terminology of the lattice regularization of quantum field 
theories is layed out in [0. Non-perturbative renormalization and 0(a) 
improvement are discussed in [|17| and [|T^. These references also review the 
Schrodinger functional approach. 



2.1 Perturbative renormalization: a brief re- 
minder 

In perturbation theory, where transition amplitudes and other quantities are 
expressed by Feynman graphs, the need for renormalization arises in loop 
diagrams, which are divergent when evaluated in a naive way. The first step 
in the renormalization procedure is to construct a Lagrangian from the bare 
one Cb and additional counterterms, 

Cb ^ C = Cb + SC. (2.1) 

The bare Lagrangian develops the usual divergences, and 6C creates ad- 
ditional diagrams. Now one has to regularize the divergent diagrams. A 
suitable method for gauge theories is the dimensional regularization |T^, in 
which one continues the theory analytically in D = 4 — 2e dimensions. Its 
advantage e.g. over momentum cutoffs is that gauge invariance is manifestly 
retained (for parity conserving theories). Divergences then emerge as poles 
in the limit e = 0, while convergent integrals are unaffected. 
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The poles can be canceled by choosing the coefficients of the counterterms 
appropriately. This choice is not unambiguous, and possible choices differ by 
finite amounts. The precise set of rules for fixing the coefficients is called a 
renormalization scheme. 

In the minimal subtraction (MS) scheme only the poles are sub- 
tracted. It is a member of a larger class of schemes which are mass indepen- 
dent, i.e. the renormalization condition does not depend on the renormalized 
masses. A very popular member of this class is the MS scheme [^, in which 
further terms are subtracted that frequently appear in Feynman graphs. An 
example for a mass dependent scheme is the momentum (MOM) scheme p2[ , 
which is defined by imposing boundary conditions on the Green's functions 
in momentum space. 

In principle, it can happen for an arbitrary Lagrangian that different 
kinds of divergences appear in every order of perturbation theory. An infinite 
number of coefficients would have to fixed then. Such a theory would have 
little predictive powerQ. A theory is called renormalizable if only a finite 
number of counterterms is necessary. In such a theory, the renormalized 
parameters can be adjusted to take their physical values. Once this has been 
done, one can make predictions. 

Fortunately, with gauge theories we are in a comfortable position. In his 
famous articles [0, , 't Hooft has proven the renormalizability of unbroken 
and broken non-abelian gauge theories. The renormalized theory is gauge 
invariant, and the counterterm structure is quite simple in that all necessary 
terms are already present in the bare Lagrangian. 



2.2 Running coupling and masses 

In the course of dimensional regularization, one has to express dimensionful 
quantities by some scale n not present in the Lagrangian itself. Similarly, 
other regularization schemes introduce some cutoff scale in order to render 
integrals finite. Consequently, renormalized parameters unavoidably acquire 
a dependency on a renormalization scale. 

In the QCD Lagrangian, the bare coupling qq and the bare quark masses 
mo,i for the flavors i = 1 . . . Nf are the bare parameters of the theory. These 
parameters are fixed in a renormalization scheme such that a corresponding 
number of physical observables take their prescribed values. After this renor- 
malization, there is no freedom any more, and other renormalized parameters 
can be predicted. 

^It might still serve as a low-energy effective theory. The Fermi theory of weak inter- 
actions is an example for this p3| . 
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A natural question to ask is "How do the renormalized Green's functions 
change with the renormahzation scale when the bare ones are held fixed?" 
This question is answered by the renormahzation group equations. In the 
following, we assume a mass-independent scheme, i.e. a scheme where the 
definition of the renormalized parameters does not depend on the quark 
masses. In that case, the renormahzation group equations assume a simpler 
form. 

For the coupling, one is led to a description of the scale dependence by 
the Callan-Symanzik /^-function, 

/i^ = /5(^7R)- (2.2) 

The /^-function has an asymptotic expansion 

P{9r) ° -glibo + b^gl + b,g'^ + ■■■), (2.3) 
where the first two coefficients are universal, 

and the higher order coefficients depend on the scheme. For A^f < 16, the 
expansion of the /3-function obviously begins with a negative term, i.e. in the 
asymptotic high energy regime, the coupling decreases logarithmically with 
increasing energy. This property is known as asymptotic freedom. It refiects 
the observation that at high energies, quarks behave like free particles. 

In a way similar to the coupling, the scale dependence of the renormalized 
masses is described by the equations 

/W^^^ = T^(fi'R)"^R,i, i = l...Ni, (2.5) 
where the r-function has an expansion 

^(^r) = -giido + digi + d2g^ +■■■) (2.6) 
with a universal coefficient 

In the MS scheme, the f3- and r-functions are known up to 4-loop in pertur- 
bation theory |2B 
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The asymptotic solutions of the renormahzation group equations are 

1 



26olog(/x/A) 
[log(/i/A)]^°/^^° ■ 



(2.8) 



The integration constants A and Mj can be regarded as the fundamental 
parameters of QCD. This means, once these parameters are known, they 
uniquely fix all running parameters at all scales. 

The A parameter depends on the renormahzation scheme, but can be 
exactly transformed between different schemes through the 1-loop coefficient 
relating the couplings in those schemes. The Mj are scheme independent, 
and are therefore also called renormahzation group invariant quark masses. 

In order to obtain the fundamental parameters of QCD from the renor- 
malized parameters at a finite scale, one has to integrate the renormahzation 
group equations. This connection is given by the exact relations 



A = //(6o^R(A^)')-'^/'''°exp 



X exp 



dx 



2&o5'r(/^)' 

1 1 

+ 



(5{x) box^ 



Mi = mR,,(26o^R(//)2)-'^°/2^° X 



X exp 



dx 



r{x) _ do_ 
(5{x) bgX 



X 



blx 



(2.9) 



In practice, one inserts the (3- and r-f unctions to a finite order of perturbation 
theory here, provided that the perturbative behavior has already set in at 
the scale 



2.3 Non-perturbative renormalization 

As described in the introduction, an important aim is to compute the run- 
ning coupling at all scales. A natural way to achieve this is to start with 
a non-perturbative scheme that is based on a lattice regularized theory. In 
order to eliminate the bare parameters of the theory in favor of physical ones 
at low energies, one uses this theory to compute hadronic observables like 
the pion decay constant Fj^ and hadron masses. Then the coupling is evolved 
to high energies and compared to experiments via jet cross sections etc. The 
connection between low-energy hadronic schemes and perturbative energies 
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however involves scales very different from each other, thus imposing heavy 
demands on lattice simulations: on the one hand, one must choose the lattice 
cutoff away from the energy scale /x, in order to avoid large lattice arte- 
facts hampering an extrapolation to the continuum. The limiting quantity 
here is the energy scale at which the connection to perturbation theory is 
made, which should be e.g. around 10 GeV. On the other hand, the system 
size L should be large enough to avoid finite size effects. The relevant energy 
scale for this is the confinement scale at about 0.4 GeV in the quenched ap- 
proximation, or even the pion mass at about 0.14 GeV. Together, these 
constraints imply simulations on lattices with linear extent L/a ^ 70, which 
is difficult to achieve in practice. 



2.4 Strategy 

The Schrodinger functional scheme uses a trick to overcome the problem of 
widely disparate scales: instead of regarding finite size effects as a problem 
that is distorting physical states of the finite system compared to the infinite 
system, one considers the finite volume behavior of the system as origin of 
interesting observables. This is analogous - though not equivalent - with 
the computation of critical exponents in statistical systems, which can be 
extracted from the change of observables with the box size. 

In a finite-volume renormalization scheme, the running of the coupling 
with the energy scale is identified with the running of a coupling g{L) with 
the system size L = fi~^. One starts at a low energy scale Lmax which is fixed 
by requiring that the coupling takes some value, 

g{Lmax) = prescribed value. (2-10) 

The physical value of Lmax has to be connected to a physical scale by com- 
puting for instance in units of i^~ax- 

Then this coupling is traced non-perturbatively to energies high enough 
for perturbation theory to apply. Finally, one can use the relations (|2.9|) to 



calculate the A parameter. This can be transformed to the A parameter in 
the MS scheme with a 1-loop order calculation. 

The last step in this technique actually corresponds to a transformation 
of a small volume coupling to the infinite volume coupling a^jg. Since these 
couplings are in one-to-one correspondence, this matching is entirely justified 
if they are just small enough for perturbation theory to be applied. 

The important point to notice here is that although finite- volume quanti- 
ties have been used in the scale evolution of the coupling, there is no reference 
to the volume in the final result anymore. This is achieved by the standard 
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assumption that QCD physics is described by the same Lagrangian regard- 
less of the context in which it is used. In particular, the finite size effects 
of the theory are predicted by the Lagrangian, as is for example the energy 
dependence of scattering processes. 

Now we can introduce an additional concept used to evolve the running 
coupling. As introduced before, this evolution is given by the renormalization 
group equation ( p.2|) . Thus, the coupling at a scale 2L is related to the 



coupling at L through a unique function, called the step scaling function^, 

f{2L)=a{g\L)). (2.11) 

With the help of this function, the coupling can be computed recursively at 
scales 2~'^Lmax beginning at a starting point Lmax- 

The step scaling function can be computed with the help of Monte Carlo 
simulations. First, one simulates a lattice with L/a lattice sites in each 
direction and tunes the coupling to the desired value. Then one simulates 
a lattice with twice the extent, 2L/a, using the same bare parameters. The 
coupling obtained from this simulation is an approximation S(m, a/ L) of the 
step scaling function a{u). We expect this to have 0(a) lattice artefacts. 
With the 0(a) improvement programme discussed later, we assume (apart 
from logarithmic corrections), 

a(M) = S(M,a/L) + 0(a2). (2.12) 

Therefore, by computing S(m, a/ L) for a number of lattice sizes L/a, one can 
obtain (t{u) by extrapolating to the continuum. 

A notable property of this recursive scheme is that by identifying yU = L"^, 
the constraints on the required lattice sizes are significantly relaxed. Instead 
of 

L > m-^ > > a, (2.13) 

we only need 

L > a (2.14) 

for an extrapolation to the continuum limit. 

The step scaling function can be understood as an integrated version of 
the /3-function for finite changes of the scale. The couplings at L and 2L are 
related through an integral, 

log2=/ -jT = - ^ 2.15 

JL L' Jg{L) P{X) 

such that the perturbative expansion of a can be derived from the expansion 
( |2.3| ) of the /3-function, 

a{u) =u + SqU^ + siu^ + S2u'^ + O(m^), (2.16) 
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where 



So = 2 log 2 60 

si = (2 log 2)25^ + 2 log 2 61 

S2 = (21og2)363 + (21og2)2^6o&i + 21og2 62. (2.17) 
In addition to the truncated n-loop expansion of the step scaling function 

n 

a-i°°P(n)=n + ^Sfc_in^+\ (2.18) 

k=l 



we define another perturbative step scaling function as the solution of (|2.15|) 
with a truncated /3-function, 

log2=/ dx[Y.h-ix^+^] . (2.19) 

\fc=l / 

The functions cr"^'°°P(M) and o-"~^°°p(u) differ by terms of order u""*"^ from 
each other and from the exact function a{u). In practice, 
be a better approximation for the exact function a. The difference between 
both variants may be used as an estimate of the neglected higher order terms. 



2.5 Improvement 

While a finite size technique nicely solves the problem of disparate scales, it 
still inherits a problem from QCD that makes the precise measurement of 
observables difficult. In order to determine continuum quantities, one must 
compute them for several values of the lattice spacing a and then extrapolate 
to the continuum limit a — >^ 0. Unfortunately, this is also the limit in which 
Monte Carlo algorithms suffer from critical slowing down, i.e. the cost of 
conventional algorithms grows with a rate proportional to at least in 
the quenched approximation and typically more than a"'' with dynamical 
fermions. As a consequence, one can not get very close to the continuum limit 
with currently available hardware, and it is not obvious that measurements 
are already in a range in which e.g. an asymptotic behavior linear in a can 
be assumed for an extrapolation. 

Symanzik found that lattice theories are equivalent to effective continuum 
theories that make the cutoff dependence explicit, order by order in a. This 
means, the theory corresponds to an action 

S'eff = / d'^x {Co{x) + aCi{x) + ■ ■ ■} (2.20) 
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and effective lattice fields are 



+ a(j)i + ---. (2.21) 



Here, Cq stands for the naive continuum Lagrangian (in our case, the QCD 
one) and the higher Ck are linear combinations of local operators also called 
counterterms. The set of possible operators in each order is restricted by 
the demand that they must have dimension 4 + k and be invariant under 
the symmetries of the lattice theory. The improvement coefficients of these 
terms are functions of the bare couplings and are not known a priori. 

Following this observation, Symanzik suggested to use improved lattice 
actions and improved local fields in order to reduce the size of lattice artefacts 



and accelerate the rate of convergence to the continuum p8|, |29|. Meanwhile, 
it is common in the ALPHA collaboration to use 0(a) improvement in QCD, 
such that cutoff effects linear in a are removed in all on-shell quantities. 
In this context, strategies have been developed to compute improvement 
coefficients non-perturbatively |3y]. Several improvement coefficients have 



been determined perturbatively |32| and non-perturbatively |^ |3^, 



36| 



For lattices without boundaries, the only necessary counterterm is the so- 
called clover, or Sheikoleslami-Wohlert term The boundary conditions 



used in the Schrodinger functional approach are not translational invariant. 
Therefore, additional counterterms have to be added at the boundary. A 



detailed analysis can be found in |38 



In the following sections, we shall assume degenerate quark masses and 
denote the bare mass with rriQ. In the lattice regularization, chiral symmetry 
is explicitly broken. As a consequence, the quark mass gets an additive 
renormalization depending on the lattice spacing. One defines the critical 
mass rndgo) as the bare mass for which the renormalized mass vanishes. A 
subtracted mass is then defined as = rriQ — rrtc. Furthermore, in the 
improved theory, a rescaling of the bare parameters by factors 1 + O(amq) 
is necessary [Q. The general connection between bare and renormalized 
parameters is then given by 



qI = glil + bgam^) gl = g^Zgiglafi) 

rhq = mq(l + 6mamq) ttir = mqZ^{gl,afi). 



(2.22) 



The coefficients 6g, are again improvement coefficients which are indepen- 
dent of the renormalization scheme. 
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2.6 Model 



In the following, we will describe the model and the imposed boundary con- 
ditions. For undefined notations, we refer to appendix |^. We set up our 
theory on a four- dimensional hypercubic lattice with lattice spacing a and 
extent T in the time direction and L in the space directions. Normally, we 
set T = L. On the links between neighboring sites x and x + afi (where fi 
denotes the unit vector in direction = 0, 1, 2, 3) lives a gauge field that is 
represented by SU(3) link variables U{x, /i). Furthermore, on the lattice sites 
reside A^f fiavors of mass degenerate fermionic quark fields ipf{x) which also 
carry Dirac and color indices. We do not specify A^f at the moment. Later we 
will consider the theory in which A^f is continued to negative numbers. This 
has to be done after the integration over the quark fields has been performed. 



time 




space 



Figure 2.1: Schrddinger functional boundary conditions. 



We think of the lattice as being wrapped up on a cylinder, i.e. for the 
gauge fields we impose periodic boundary conditions in the space directions, 
while the quark fields obey periodic boundary conditions up a to a phase 
factor exp(z6') Q. 

The gauge field at the boundary takes the form 



U{x, /c)Uo=o 
U{x, k)\^o=T 



exp(aC) 
exp(aC"). 



(2.23) 



This still leaves open a wide range of possibilities. First, one imposes the 
restriction that the matrices and C^. are diagonal and independent of k, 



Ck 



L 



( 01 


V 












L 



( 4>[ 



V 












(2.24) 
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with real (pk and 01 + 02 + 03 = <Pi + 02 + 03 = such that the corresponding 
hnk variables are in SU(3). 

We choose the boundary fields as a line through the point "A" as discussed 
in p, parametrized by a variable r/, 

01 = ^ - 3 = -ry - TT 

1 , 1 TT 

02 = - 2'n + ^ (2.25) 

1 TT ,/ 1 27r 

03 = -2^+3 '^3 = + 

The boundary fields on the opposite sides of the cylinder are chosen such 
that the partition function is invariant under a combination of a time refiec- 
tion, charge conjugation and a central conjugation. 

A solution of the field equations for the link variables then is V{x, fi) = 
exp(aS^(xo)) with 

5o(xo) = 0, Bkix) = [xoCl + (T - Xo)Ck]/T. (2.26) 

In it has been shown that for SU(3) and for the lattice sizes of interest, 
this solution is the unique minimum of the action introduced in the next 
section, in an environment of t] = 0. Hence, we use the notion that the 
boundary field enforces a constant color-electric (classical) background field. 

The boundary conditions for the quark fields are discussed in detail in 
JT[. The boundary quark fields serve as sources for fermionic correlation 



functions. They are set to zero after differentiation, 

P+iIj{x)\^o=o = p(x), P_V(a;)Uo=r = p'(x) 

ij}{x)P.U,=o = p(x), ^{x)P+U,=T = p'(x). (2.27) 

Here we have used the projectors P± = |(1 ± 70). For notational reasons - 
namely in order to avoid referencing undefined fields in the Dirac operator 
-, we furthermore extend the time direction beyond the boundaries and set 

ipix) = ^{x) = for a;o < and xo > T (2.28) 

and 

P+tjj{x)\cco=T = P-i^{x)\xo=0 = 

^P{x)P.\x,=T = i^{x)P+U,=o = 0. (2.29) 
Analogously, all link variables outside the cylinder are set to the unity matrix. 
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The Schrodinger functional is the partition function of this system and 
is defined as a path integral over all gauge and quark fields that fulfill the 
given boundary conditions, 

Z[C\ C] =e-^ = J D[U]D[ij]D[ij]e-^^^'^''f'\ (2.30) 

D[U] denotes the measure Ilx,ixdU{x, /i), and D[i/j] stands for the product 
over sites, Dirac and color indices UxDcdipDci^)- The expectation value of 
any product of fields is now given by 

(O) = (4 / D[U]D[ilj]D[ilj]Oe-^^^'^'^^\ . (2.31) 

Note that possible choices for O include the variational derivatives 



5p(x) ' 5p(x) 

^'1=') - sky - -ikr ^'-''^ 

These act on the Boltzmannian and have the effect of inserting ■^(a;) terms 
near the boundary. 



2.7 Coupling 

We can interpret the effective action as a function of the background field, 

r[B] = - log Z[C',C] (2.33) 

The background field can be varied by changing the parameter 77. We define 
a derivative for F as the response to a change of the background field, 

riB] = (2.34) 

It has a perturbative expansion 

r'[B] = lr[, + ^^ + g^r^ + ■■■. (2.35) 

9o 

The renormalization properties of the Schrodinger functional have been 
studied in perturbation theory. Symanzik has proven the renormalizability of 
the 0^ theory with Schrodinger functional boundary conditions to all orders 
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of perturbation theory ||3^. For QCD, the renormahzabihty has been estab- 
hshed up to 2-loop for A^f = and up to 1-loop for dynamical fermions 

The important result is that the effective action is finite after the bare 
coupling has been eliminated in favor of a renormalized coupling, and the 
fermionic boundary fields have been rescaled with a renormalization factor. 
We infer that T' is itself suitable as a renormalized coupling. It is normalized 
such that its perturbative expansion begins with the bare coupling at tree 
level. The Schrodinger functional coupling g is then defined as 



r[B] 



r)=0 



(2.36) 



The normalization factor is calculated as 



T',[B] = 12(L/a)2[sin(7) + sin(27)], 7 = -7r{a/Lf 



(2.37) 



It is clear that is an inherently non-perturbative definition of the cou- 
pling, as desired. Its only dependence on an external scale is on the system 
size L. We can therefore speak of it as a coupling running with L. The QCD 
coupling Q;s(/i) at the scale is related to it by 



a.i,) = (2,38) 

On the practical side, the coupling defined in this way can easily be 
computed in Monte Carlo simulations as the expectation value 



T'[B] = (^). (2.39) 




In order to completely define the scheme for A^f 7^ 0, one complements this 
definition with the condition that the coupling is taken at vanishing current 



mass mi. The definition of the current mass will be introduced in section . 
This condition can safely be imposed because the Schrodinger functional is 
known to have a mass gap of order 1/T in perturbation theory. The step 
scaling function is then defined as 

E(u,a/L) =f(2L) . (2.40) 
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2.8 Action 



The action is given as the sum S'[f/, t/^, ■?/'] = Sg[U] + [t/, -i/^, ■?/'] of a pure 
gauge term and the fermionic action. For the pure gauge part, we use the 
Wilson plaquette action modified by 0(a) improvement, 

s,[u] = \j:^ipmi-u{p)) (2.41) 

9o p 

Here, U {p) denotes the parallel transporter around a plaquette p, 

U{p) = U{x, ^i)U{x + a/t, v)U\x + av, ^)U\x, v) (2.42) 

and the sum extends to all oriented (i.e. left-handed and right-handed) pla- 
quettes. In this thesis, we will alternatively express the bare coupling by 

P = Q/gl 

On a system without boundaries, the Wilson action already reaches the 
continuum limit with O(a^) artifacts and the weights are w{p) = 1 for all 
plaquettes. In our Schrodinger functional setup however, 0(a) improvement 
is achieved by adding a counterterm at the boundaries. The addition of this 
term is equivalent to a modification of the weights such that 

w{p)=c{go) (2.43) 

if p is a time-like plaquette attached to a boundary plane. In all other cases 
w{p) = 1. The improvement coefficient q is only known perturbatively. Its 
2-loop value depends quadratically on A^f and has the form 

ct(^o) = 1 + (-0.08900(5) + 0.0191410(l)iVf) gl 

+ (-0.0294(3) + 0.002(l)A/'f + 0.0000(l)A/'f2) g^ + 0{g^). (2.44) 

For the quark fields, we start with a fermionic action of the form 

Sf[U,^,tlj] =a^^^/'(a;)(D + mo)^/'(x) (2.45) 

X 

with the Wilson-Dirac operator 



^EK(V; + V^)-aV;vJ. (2.46) 



D 

The derivative operators are given by 

V^ip{x) = - X^U{x, fi)ilj{x + afi) — ipix) 

V*ip{x) = - 'ip{x) — \* U\x — afi, (x)ip{x — afi) . (2.47) 
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Figure 2.2: Graphical representation of the products of hnks contributing to 
the clover term. The point in the middle is x. 



As a difference to the conventionally used operators, they include phase fac- 
tors 



A, 



^ie^a/L 



6r. = 0, -TT < 01, < n. 



(2.48) 



As can be easily seen, these factors are equivalent to boundary conditions 

^{x + Lk) = e'^'^ipix), ipix + Lk) = V^(x)e-*^^- (2.49) 

in the space directions. However, in an implementation on the computer, it 
is simpler to "distribute" this phase on the difference operators and impose 
strict periodic boundary conditions on the fields. 

In [|T1|, it is argued that the choice of O^. should be guided by practical 
considerations. The lowest eigenvalue of the Dirac operator on the classical 
background field varies with 6k- In a Monte Carlo simulation, the square of 
the Dirac operator is inverted frequently, so that a small condition number 
can improve its performance. An optimal condition number has been found 
around the value 6^ = 6 = 71/5 which we use here. 

In the quark sector, 0(a) improvement can be implemented by adding 
certain terms to the Dirac operator. One is a bulk term 



SD,ij{x) 



F^^{x)tp{x) 



also known as Sheikoleslami-Wohlert term. In this term, 

^ 1 



(2.50) 



(2.51) 



is the lattice definition of the field strength tensor. The term Q^^^ is visualized 
in figure |2.2| and is explicitly given by 



Qfiu{x) = [^U{x, jj,)U{x + a/t, z/)f/^(a; + az>, /x)[/^(a;, u) 
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+U{x, iy)U\x + au — ap,, ^)U\x — ajl, i')U{x — ap,, /i) 

+?7^(s — a/i, ii)U\x — au — afi, u)U{x — az> — a/i, iJ,)U{x — au, u) 

+U\x — au, v)U{x — az>, ij,)U{x — az> + afi, v)U\x, (2.52) 

As with Schrodinger functional boundary conditions, we do not have 
translational invariance in the time direction, 0(a) improvement here re- 
quires an additional term 

5D^ij{x) = (ct - l)-{5,,,aVP{x) - U\x - a6)P+tlj{x - aO)] 

+4o,r-a[V^(x) -f/(x,0)P_V^(x + aO)]}. (2.53) 

The coefficient Ct is known perturbatively ||32|| , 

ctigo) = 1 - 0m795{2)gl + 0{g^o)- (2-54) 

At this point we want to introduce some further notations. Sometimes it 
is useful to separate the integration of quark and gauge fields. We therefore 
write the expectation value above as 

(O) = ([0]p)g. (2.55) 

Here, (. . .)g denotes the gauge field average with respect to the distribution 

det{D + 5D + mo)exp{-Sg[U]). (2.56) 

The fermionic expectation value [. . .]f can be represented by a generating 
functional. We refer to for a detailed discussion and only list the relevant 
results here. The quark propagator S{x, y) on a given gauge field is defined 
as the solution of 

{D + 5D + mo)S{x,y) = a~^5{x,y), 0<xo<T. (2.57) 

with appropriate boundary conditions. It fulffils 

S^x,y)=j,S{y,x)j,. (2.58) 

Furthermore, one can define a propagator H{x) "from the lower boundary 
to point x" through 

{D + 5D + mo)H{x) = a-^S{xo, a)dtU^{x - aO, 0)P+. (2.59) 
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Fermionic expectation values [. . .]f necessary for the computation of corre- 
lation functions can be expressed in terms of these propagators. The basic 
boundary-bulk 2-point functions are 

y 

«'E[C(y)V^(^)]F = H^{x). (2.60) 
y 

One proceeds similarly for the upper boundary. 



2.9 Mass 

For the definition of a quark mass, we use chiral symmetry to derive a relation 



between correlation functions, following [^. In the continuum, the PCAC 
(partially conserved axial current) relation c?^Aj^ = 2mP^ is a special case of 
the chiral Ward identity and connects the isovector axial current 



A^(x) = ^(x)7^75y^(a:) (2.61) 



and its associated density 

P'^(x) =7A(x)75yV^(x). (2.62) 

The matrices Tq, a = 1 ... 3 are the Pauli matrices and act on the flavor in- 
dices. On the lattice, we demand that the PCAC relation holds for renormal- 
ized quantities. This means, we use it to derive relations between correlation 
functions and require these to converge to the proper continuum limit. Since 
we are going to implement improvement in the action, we also have to use 
improved operators in order to get a mass definition that has only O(a^) 
artefacts. An analysis shows that this amounts to an addition of a term 

6A; = CA^^P;. (2.63) 

to the axial current operator. Here, we use the notation 9^ = 1/2(5^ + 9*) 
for the average of forward and backward derivative on the lattice, ca is a 
further improvement coefficient. It has been computed to 1-loop order in 



perturbation theory 31 



ca(^o) = -0.00756(l)^o'- (2-64) 



Non-perturbative data in the quenched approximation is also available ||33| 



but not used in this work. With this improvement term, the renormalized 
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axial current and its associated pseudo-scalar density are given by the ex- 
pressions 

(Ar)^ = ZA(l + 6Aamq) [A^ + CAa^P;^ 

{Pk)1 = Zp(l + 6pamq)P;. (2.65) 

Here, Za and Zp are renormalization factors. While the former depends only 
on the bare coupling, the latter is scale dependent. 

We now define a renormalized mass through the relationship 

{d,{A^)l{x)0) = 2m{{Pnnx)0) + O(a^), (2.66) 

where O may be any product of improved renormalized fields located at non- 
zero distance from x. Furthermore, we define a current quark mass by the 
relation 

[{d^A^ + CAad;d^P''}0'') = 2m(P'^0'^), (2.67) 
where is the operator 

(^" = «'EC(y)75^C(z). (2.68) 

We sum over all space-like x and define bare correlation functions f\ and /p 
as 

/a(^o) = -^E^(^S(^)C(y)75yC(z) 
/pW = -^E^(n^)C(y)75yC(z)). (2.69) 
Similar correlations functions can be defined at the upper boundary, 

-,6 



/a(T-Xo) = |^E^(^S(^)C'(y)75yC'(z)] 
MT-xo) = ^E^(n^)C'(y)75yC'(z)). (2.70) 



x,y,z 



A time-dependent mass is then obtained as 
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In practice, m(xo) turns out to have large lattice artifacts at the boundaries, 
with a plateau in the middle. Thus, choosing xq to be in the middle of the 
lattice is a good idea, 

{1TL (^^^ for even T/a 

1 (^rn{^) + m{^^)) for odd T/a. ^^'^^^ 

This is a quantity that can be actually measured in a simulation and plays 
the role of an unrenormalized mass. We also use the notion of a PCAC 
or current mass. In contrast, a computation of the renormalized mass re- 
quires knowledge about the renormalization factors. By combining (p.66|) 
and (|2.67|) , we find the relation 

_ Za(1 + bAam^) ^ 

■m= y \mi + 0(a). (2.73) 

Zp[l + bparriq) 

This relation also reflects the knowledge that the current quark mass has 
no additive renormalization. The definitions of the correlation functions still 
contain fermionic expectation values. For a practical measurement in a sim- 
ulation, these have to be integrated out. From an application of Wick's 
theorem, (p.60|) and ( p. 581) , one gets 

/a(xo) = -|^E^(Tr{[C(z)V^(x)]F7o75[V^(x)C(y)]F75})^ 

^ x,y,z ^ 

= -^^E(Tr{i^^(x)7oi^(a:)})^, (2.74) 

where the trace is over Dirac and color indices and not over flavor. Analo- 
gously, the correlation function /p can be written as 

/p(^o) = E (Tr {h\x)H{x)})^ . (2.75) 

With the help of the equations (|2.74|) , (p.75|) together with (|2.59| ) one can 
compute /a and /p in a Monte Carlo simulation. 
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Chapter 3 



Performance of algorithms in 
pure gauge theory 



3.1 Motivation 

From an algorithmic perspective, the simulation of gauge theories with fermi- 
ons is fundamentally different from the simulation of pure gauge theories. 
A pure SU(A^) theory has a local action. Therefore, it is natural to ap- 
ply local algorithms for its simulation, unless such algorithms fail in effec- 
tively decorrelating successive configurations. The Hybrid Overrelaxation 
algorithm (HOR) has become a classical method for this purpose. 

On the other hand, an important property of QCD with fermions is the 
Pauli principle, which in the path integral formalism leads to a formulation 
with Grassmann variables. In practice, Grassmann variables cannot easily 
be handled numerically, and thus the common way of treating them is to 
integrate the path integral analytically and represent the resulting fermion 
determinant as a path integral over boson fields also called pseudo fermions. 

In terms of these boson fields, the action is highly non-local. This means 
that any algorithm consisting of a sweep over the lattice and updating links 
locally needs an amount of work at least quadratic in the lattice volume for 
a complete update of all link variables. In general, two groups of algorithms 
have been proposed to solve this problem. One is based on using molecular 
dynamics and includes the Hybrid Monte Carlo (HMC) method and variants 
thereof. The other is the MultiBoson (MB) method. 

The MultiBoson algorithm has been proposed by Liischer ETJ. The basic 



idea is to transform the theory into a local bosonic theory. To this end, the 
fermion determinant - here for two degenerate flavors - is approximated by 
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a polynomial, 

det Q2 = jim [det P„(g')]"\ Q = 75P + ^0), (3.1) 

in an interval that covers the spectrum of the matrix . Using the her- 
miticity property of Q, one can express the approximate fermionic action 
by 

/n 
D[0]D[0t]D[0]exp{-a^^^ 

a; k=\ 

(|(Q-/^.)0.(a:)P + z/,2|0fc(a;)p)}, (3.2) 

where the constants yUfc and V}^ are given by the roots of the polynomial. 
There are n boson fields 0^ which carry color indices. When Q is local, these 
fields may be updated with algorithms known from pure gauge theory. The 
deviation of the polynomial approximation from the exact measure may be 
corrected for instance by reweighting. 

It has turned out that in practice, a sufficiently precise approximation 
requires large n, i.e. a large number of boson fields. This seems to compensate 
the advantage of having to simulate only a local action. As an additional 
disadvantage, it is difficult to simulate the action (|3.2| ) when the fermion 
matrix includes a clover term, because in that case the matrix (Q — /i^)^ 
contains terms beyond the hypercube around the lattice site. A review of 
the MultiBoson technique can be found in p2|. 



The standard algorithm for the simulation of a pair of dynamical fermions 
is the Hybrid Monte Carlo algorithm |^ . It generates new configurations by 



computing discretized molecular dynamics trajectories. An acceptance step 
makes this algorithm exact. Hybrid Monte Carlo is not applicable to either 
odd fiavor simulations with Wilson fermions or simulations with (A^f = 4) 
staggered fermions. In those cases, it is possible to apply e.g. the R algorithm 
H], which does not include an acceptance step. It therefore has systematic 



errors dependent on the step size, and one has to extrapolate to zero step 
size in order to get reliable results. 

It is clear from the preceding discussion that the development of the Hy- 
brid Monte Carlo and other modern algorithms is mainly motivated by the 
wish to simulate dynamical fermions efficiently. Since the characteristics of 
these theories are very different from pure gauge theory, one cannot transfer 
any experience about the relative merits of different algorithms from dynam- 
ical fermions to pure gauge theory, and vice versa. In this part of our work, 
we first compare different algorithms in pure gauge theory. This gives results 
about the overhead of HMC compared to HOR before dynamical quarks are 
included. 
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3.2 Hybrid Overrelaxation 



A standard algorithm for the simulation of pure gauge theory is a combi- 
nation of overrelaxation and heatbath steps. Both are local updates. The 
overrelaxation algorithm is a microcanonical update which is known to decor- 
relate configurations very efficiently. It is not ergodic though. The heatbath 
step is necessary in order to guarantee a simulation of the correct (non- 
microcanonical) statistical ensemble. 

3.2.1 Heatbath 

In general, the notion of a heatbath algorithm is used for Markov chains in 
which each new configuration is chosen from the canonical distribution, i.e. 
independent from the previous configuration, 

P{U' ^U)<x exp{-S[U']). (3.3) 

For arbitrary actions and arbitrary fields it is a non-trivial task to find an 
efficient way for generating such an ensemble. We begin here with an SU(2) 
theory. The SU(3) case can be connected to SU(2) by updating subgroups 
and will be discussed later. 

In the following, we drop the arguments x and ^ so that it is obvious that 
we have to cope with the distribution of a mere SU(A^) matrix. 

SU(2) 

We can formulate the goal of finding a heatbath update for SU(2) as finding 
an algorithm which produces the distribution 

dPiU') oc e5^(^'^')cif/', (3.4) 

where W is a. real multiple of an SU(2) matrix W. This is e.g. the case 
for an SU(2) theory with a Wilson action. In that case, det > 0, so that 
we can write W = V det W W. This distribution factorizes when the SU(2) 
variables are expressed in the quaternionic representation, 

U' = cto + ^'^i'^i 

W = wo + iwjaj. (3.5) 

From the invariance of the Haar measure under the multiplication with group 
elements and (|A.11|) follows 
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1 



OC ^5(n^ - l)ef""'Jl - daod'^n. (3.6) 



In the last step, we have set aj = nj\Jl — Oq and p = ^Wq + WjWj. 

This means, we have to generate a flat distribution in and a uniform 
distribution on the surface of a three dimensional sphere for n. These produce 
a matrix U' , which is according to (|3l6|) to be multiphed with W, resulting 
in a new value for the link variable. 

Since the angular element can be parametrized as dQ = d{cos6)d(j), the 
n vector can simply be generated by drawing a pair {ui, U2) of numbers with 
a uniform distribution in [0,1) and setting 

ni = 1 — 2ui 

n2 = \Jl — nl cos(27r'U2) 



ns = Y 1 — sin(27ru2). (3.7) 

A method to generate the distribution in oq has been described by Fabri- 
cius and Haan in [45] and shortly later by Kennedy and Pendleton [46|. The 
basic idea is to generate a variable y according to 

Piy)^expi-y)^e{y) (3.8) 

and to set Oq = 1 — yp- The distribution of the generated Oq is given by 

P{ao)dao = P{y)dy 

^P{a,) = P{y{a,))^ 

oc ef'Wl- ao9{l- ao). (3.9) 

The difference to ( |3.6|) is a factor y^l + qq 9{1 + Qq), which can be accom- 
modated by an acceptance step: the proposal from ( |3.9| ) is accepted if a 
flat random number c fulfills 2c^ < 1 + Oq. This corresponds to a Pi{c) = 
9{2c^ — 1 — ao) and therefore 

dcP{ao)Pi{c) = / c/ce'"'Vl-ao^(l -ao)^(l + ao) 

= e^"«v^r^^(l -ao) 

oc P(ao). (3.10) 

In the form proposed by Fabricius and Haan, one maps a uniform dis- 
tribution to the needed distribution in oq by a pretty complicated function 
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involving logarithms of logarithms, which is difficult to calculate in a suffi- 
ciently precise way. 

Here we use a slightly different method discussed in E^, which combines 



two distributions in order to obtain the desired one, and therefore requires 
twice the number of random numbers per update. In order to create a value 
for qq, one generates a number a with a Gaussian distribution 

Pi{a) oc exp{-a^)9{a) (3.11) 

and a number b distributed hke 



P2ib) (xexp{-b)9{b). (3.12) 

Then one makes a variable transformation y = a?+b and z = a. The numbers 
z are finally dropped. As for the Jacobian of this transformation, 

det|^ = l. (3.13) 

the combined distribution of y and z is 

P{y, z) = Piia{y, z), b{y, z)) x P2(a(2/, z), b{y, z)) 
= exp{-z^)9{z) X exp{-y + z^)9{y - z^) 
= eM~y)d{z)9{^-z). (3.14) 

Since the generated numbers z are dropped, the resulting distribution of y is 
obtained by integrating over z, i.e. one really gets the desired 

P{y) = J dzP{y,z) = ^exp{~y)9{y). (3.15) 

The distributions Pi and P2 are well known: the former is the Gaussian 
distribution explained in appendix |B|. The latter can be generated from 
uniformly distributed random numbers u by taking b = — log(l — u). 



SU(iV) 

We assume that the action has the form 

S[U] = -ReTr{f/(x,/i)0(x,/i)} + ---, (3.16) 

with a part that depends linearly on the link variable U{x,iJ,) G SU(A^) 
(iV > 3) that is updated, and a part that is independent. This form is fulfilled 
by the Wilson action, including the Schrodinger functional improvement term 
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in pure gauge theory. In this case, the term V{x,n) is proportional to the 
sum of the six "staples" of the link, 

S{x,jj,) = v)U{x + ai', ij)U\x + g/i, v) + 

U\x — az>, u)U{x — au, jj)U{x — az> + afi, z/)|. (3.17) 

If one of the link variables in the definition of S{x, fi) lies in a boundary plane 
it is to be multiplied with a factor Ct. 

The essential difference to the SU(2) case is that the sum of staples is not 
in general a real multiple of an element of the gauge group. This makes it 
non-trivial to factorize the distribution for U into several flat distributions. 

For the SU(3) case - and more generally SU(iV) -, the method described 
in the previous section was generalized by Cabibbo and Marinari |^ in terms 
of projecting to subgroups successively. In this approach, one chooses a set 
of SU(2) subgroups of SU(iV) with the property that no subset of SU(A^) 
is left invariant under left multiplication by JF, except for the whole group, 

^ = {SU(2)i, . . . , SU(2)„}, m>N-l. (3.18) 

This guarantees that by successively multiplying a given SU(A^) matrix with 
random matrices from these subgroups, one can reach any other SU(A^) ma- 
trix. So we are left with the task of giving the random matrices the correct 
distribution. Again, in order to simplify the notation, we do not explicitly 
display the indices x, /i. 

An update U' U starts with U^^^ = U and moves in m steps to = 
U'. The current link variable in each step is obtained by applying to the one 
from the previous step a member A^'^'^ E SU(2)fc of a subgroup. 

The matrices A^''^ are chosen according to the distribution 
dP(A,) - dA^'^ eM-S[Am^'-% 

In this work, we select as subgroups of SU(3) the three most natural possi- 
bilities, 




Ai = \ a2i a22 \, A 
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(3.21) 
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where a G SU(2). This in fact imphes one subgroup more than necessary 
for the condition explained above. It is more symmetric and may lead to 
better autocorrelation times than the minimum of two subgroups. To our 
knowledge, a detailed study on this topic has however not been done. 
In our choice of subgroups, it is obvious that 

5[^Wf/(fc-i)] = _ReTr(at;), (3.22) 
where v is the (i,j) G {(1,2), (2,3), (1,3)} submatrix of X = UW , 



Y Y 

Y Y 



(3.23) 



By using the quaternionic representation of and f , we can bring this into 
a slightly different form, 

^[^(fc)^(fc-i)] = -ReTr(at;) 

= — Re(aofo — cijWj) 

= -^Tr(awt)^ (3.24) 

where w is a real multiple of an SU(2) matrix. Its quaternionic components 
are given by 

Wo = 2 Re Vq 

Wj = -2ReWj, (3.25) 

and explicitly by 

wq = ReXjj + ReXjj 

Wi = —ImXij — ImX ji 

W2 = — ReXjj + ReXji 

W3 = -ImXii + lmXjj. (3.26) 

Therefore, from ( ^.24 ) we can see that (^^) can be obtained by applying 
the Fabricius-Haan algorithm for SU(2), as described above. 



3.2.2 Over relaxation 

The idea of over relaxation is to choose the new variable as far as possible 
from the original one, in an intuitive sense. We again assume that the action 
has the form ( |3.16| ). 
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For a pure SU(2) gauge theory with a Wilson-hke action, where W is 
the sum of elements of the gauge group, it is easy to write down an update 
U' ^ U with the property 

Tt{U'W^) = Tt{UW^), (3.27) 

namely 

This mechanism is not directly transferable to SU(3), because the matrix 
WWW is in general not a real multiple of an SU(3) matrix. But we can 
make microcanonical updates in some SU(2) subgroups of the variable in 
question. In analogy to the heatbath case, we create a new U' by applying 
matrices as in (|3.21|) , 

U' = A^AiAiU. (3.29) 

In order for the action to stay constant, the SU(2) matrices a corresponding 
to the Aj. must fulfill the condition 

) = Tiw^. (3.30) 

This is achieved by setting them to 

a = -l + 2-^^w. (3.31) 

A potentially dangerous operation in this computation is the division by the 
value TtIww"^), which may become very small and cause an overflow. In 
our implementation, when it becomes smaller then 10~^°, we leave the link 
variable at its old value instead of flipping it. This does not invalidate the 
algorithm. 

Finally, the Hybrid Overrelaxation is a combination of the two parts 
described previously [^. In each update, one heatbath step is alternated 
with A'or overrelaxation steps. Nor is a parameter which we may tune to 
reduce autocorrelation times on larger lattices. 

In it has been found that the efficiency of the HOR algorithm depends 



on the order in which a sweep over the lattice processes the links. It is 
advantageous to perform an outer loop over the Lorentz index fi and to 
embed a loop over x within this one. We have only used this variant. 
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3.3 Hybrid Monte Carlo 



The basic idea of the Hybrid Monte Carlo (HMC) is to employ molecular 
dynamics in order to collectively move the configuration through the config- 
uration space. This means, in each step all field variables are updated by 
computing their trajectory through a coupled set of equations of motions. 
It is obvious that such a step goes at least linear with the volume. But of 
course, it also has to be considered that trajectories can only be computed by 
discretizing them, so there are further parameters like the trajectory length 
and the step size. With increasing volume, it may be necessary to adjust 
these parameters, making the cost of a trajectory higher. As a consequence, 
the dependence of the cost on the system size can not be trivially predicted. 

As an alternative to molecular dynamics, one may also consider moving 
fields through the configuration space by other small-step-size mechanisms, 
such as random walks. However, a molecular dynamics trajectory is assumed 
to move more rapidly away from the original configuration, as the single steps 



of the discretized trajectory move into the same "direction" ||44 |. 

For the derivation of the equations of motions, we note that expectation 
values with respect to the action S[U] are equal to the expectation values of 
the respective observables with respect to the Hamiltonian 

H[P,U] = Wp{x,^^,jy + S[U] (3.32) 

Xflj 

with variables P{x, /i, j) from a canonical ensemble. The real-valued variables 
P{x, n^j) are the expansion coefficients of variables P{x, fi) which come from 
the Lie algebra su(3) of SU(3), 

8 

P{x,fi):=J2iXjP{x,fi,j) (3.33) 
i=i 

and which are to be interpreted as conjugate momenta of the link variables 
U{x,fi) G SU(3). Expectation values are defined by 

(F) = J D[P]D[U] expi-H[P, U])F[U], (3.34) 

with the normalization factor 

Z = J D[P]D[U]exp{-H[P,U]), (3.35) 

and obviously factorize into integrals over link variables U and the new fic- 
titious momenta P. The expectation value defined in this way corresponds 
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to a canonical ensemble. For sufficiently large lattices, one should be able 
to approximate this by a microcanonical ensemble, which is characterized 
by a constant energy of all configurations summed over. Provided ergodic- 
ity holds, a chain of configurations in the microcanonical ensemble can be 
generated by solving the canonical equations of motion for the Hamiltonian 
H[P,U]. 

Naturally, for a complex interacting theory, we cannot solve the equations 
of motions exactly, but have to resort to a discretization to compute a trajec- 
tory from a given starting point to an end point giving a new configuration. 
This not only produces errors in the configurations, but also changes the to- 
tal energy, so that we cannot expect to produce the correct ensemble in this 
way. In the Hybrid Monte Carlo, this problem is overcome by an acceptance 
step, which makes the algorithm exact. 

The classical equations of motion for the Hamiltonian and for the variables 
U{x,fi) and P{x,fi,j) are 



P{x,fi,j) 



-D,^jS[U] 
iP{x, fi)U{x, /i) 



(3.36) 



with a derivative -D^ui for the gauge fields. This derivative is defined as 



D^^jf{U{x,fx)) 



d_ 

da 



a=0 



/(e^"^^[/(x,/i) 



(3.37) 



where Aj,j = 1...8 are the Gell-Mann matrices. The form of the sec- 



ond equation in (|3.36| ) guarantees that f/(a;, /x) remains in the SU(3) group. 
For an action linear in the su(3) components of U{x, /i) it is easy to prove 
that with these equations, the energy is conserved. Then Dx^jS\U{x, ix)] = 
S[i\jU{x, ^)] and S[U{x,fi)] = S[U{x,^)] and therefore^ 



(3.38) 



When applying the derivative operator D^^j to the pure gauge action 
Sg[U] only those plaquettes contribute which contain U{x,fi). Using the 
definition of the staples (|3.171) we can write 



D.j.^jSg[U] = — 2R'6Tr i\jU{x, ^)S\x, ^) 
9o 



(3.39) 



^We only display the dependence of 5* on a single link variable here. 
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3.3.1 Detailed balance 

In the following, we prove that the Hybrid Monte Carlo algorithm fulfills 
detailed balance. While the previous sections motivated the algorithm by the 
physics of the system represented by the Hamilton operator iJ, this proof is 
based on weaker prerequisites. We assume that the probability of a transition 
[/'<—[/ is based on three probability distributions: at the beginning of a 
trajectory, momenta are drawn from a Gaussian distribution 

Pm{P) = exp I ^(^' Jn ■ (3-40) 

Following that, the combined set of variables (P, f/) is used as initial condi- 
tion for a deterministic process resulting in a new set (P', f/') according to 
(P', f/') = T(P, [/). The corresponding probability matrix is 

Pt((P', U') ^ (P, V)) = 5((P', U'), r(P, U)). (3.41) 

Finally, the new set of link variables is accepted with the condition 

Pa((P', U') ^ (P, U)) = min{l, exp{-H[P', U'] + H[P, U])}. (3.42) 

with the Hamiltonian from equation ( |3.32| ). As the momentum variables are 
discarded, we have to study the combined transition probability 

P{U' ^U)=J D[P']D[P]Pa{{P',U') ^ (P,f/)) X 

PT{{P',U')^iP,U))PMiP). (3.43) 

As further input, we demand that the computed trajectory is reversible, 

Pt((P', U') ^ (P, U)) = Pt((-P, U) ^ (-P', U')). (3.44) 

Now we can use PM(P)e~'^[^^ = e"'^'^'^' to rearrange the left side of the 
detailed balance condition, 

P{U' ^ f/)e-^[^l 

= J D[P']D[P] min {1, exp{-H[P', U'] + H[P, U])} x 

xPt((P', U') ^ (P, U)) exp{-H[P, U]) 
= J D[P']D[P] min {1, exp(-i7[P, U] + H[P', U'])} x 

xPt((P', U') ^ (P, U)) exp{-H[P', U']) 

= J D[P']D[P] min {1, exp(-iJ[P, U] + H[P', U'])} x 

xPt((P, U) ^ (P', U')) exp(-if[P', U']) 
= P{U ^U')e-^^^'\ (3.45) 
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Thus, we have proven detailed balance for the described algorithm. Note 
that we have not required that the Hamiltonian is used in the computation of 
the trajectory. The trajectory may be computed with different Hamiltonians 
or different discretizations of it, as long as these mechanisms are reversible. 
While the validity of the algorithm is not disturbed by such a choice, it may 
be important for achieving reasonable acceptance rates. 



3.3.2 Computation of the trajectory 

For the computation of a trajectory, we need an integration scheme for the 
equations of motion fulfilling ( |3.44|) . Here we use the leapfrog algorithm. As 



parameters, it has a step size St and the trajectory length r = nSr. 

The integration begins with an update step for the conjugate momenta 
with a step size At = St/2. It is followed by n — 1 update steps with 
At = St/2 for the link variables, alternating with the momentum variables. 
Finally, the link variables are updated with At = St, and the momentum 
variables with At = St/2. 

The explicit formulae for these steps are 

P'(x,/i,j) = P{x,i^,j)-D^^,S[U]At 
U'{x,fi) = exp |^^iAjP(x,/i, j)Ar| (3.46) 

The single steps cause a discretization error of the order St^ each. There- 
fore, the action for the final configuration is expected to differ from the initial 
configuration by an error of order St"^. 



3.3.3 Computation of the exponential 

In the Hybrid Monte Carlo algorithm, each step on a trajectory involves 
for each link the evaluation of the exponential of an element of the gauge 
group algebra. On the one hand, it is desirable to minimize the cost of 
this evaluation. On the other hand, the HMC algorithm is only valid if 
the computation of trajectories fulfills the requirement of reversibility. If 
we approximate the exponential exp(y4) ^ E{A), reversibility demands that 
E{~A)E{A) = 1. 

It is remarkable that this requirement can be easily fulfilled in the case of 
a U(l) group, where A E iR. Here, E{A) = (1 + A/2) /{I - A/2) is a valid 
approximation (considerations about acceptance rates notwithstanding). It 
is straightforward to generalize this to a SU(2) group, whereas the necessary 
inversion of a matrix becomes an obstacle in SU(3). 
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Functions of SU(3) matrices 



Following |^T|, we can transform any polynomial in an SU(3) matrix into a 
second order polynomial. We note that for any 3x3 matrix A, the determi- 
nant can be written as 



1 r 



detA 

6 



(Trv4)^-3(Trv4)(TrA^) + 2(TrA^) . (3.47) 



In the SU(3) case, the argument of the exponential is an element of the Lie 
algebra su(3) with TiA = 0, so we have 

det A = -TtA\ (3.48) 
3 ^ ^ 

According to the Cayley-Hamilton theorem, A is a solution of its character- 
istic equation^] 

det{A-XI) = 

= ^ {[Ti{A - \I)f - 3Tr(A - A/)Tr(A - \lf + 2Tr(A - 

= -X^ + -XTtA^ + -TtA\ (3.49) 
2 3 



Therefore, 

A^ = (-TiA' ]A+{ -TtA^ ) /. (3.50) 



Thus, any power series in A is equivalent with a second order polynomial in A 
with complex coefficients. These coefficients are in turn power series in TrA^ 
and TrA^. From A = —A^ we can conclude that TrA^ = — Tr(74A^) is real, 
and TrA^ = — Tr(A"'''^) is imaginary. More explicitly, we set a = —^TiA^ G R 
and b = ^TrA^ G R. Then an analytic function f{A) can be written as 

f{A) =a2A'^ + aiA + aoI. (3.51) 



The three coefficients 02, ai and are basis- independent and can in principle 
be computed exactly. Practically, computing f{A) = exp(74) in this manner 
is complicated. When eigenvalues of A are approximately degenerate, differ- 
ent formulae have to be used to avoid numerical problems. In particular, on 
a SIMD (single instruction, multiple data) computer such case distinctions 
are inefficient and difficult to implement. 

^By /, we denote the unity matrix. 
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Taylor expansion 



One will in general opt for an approximation of the exponential in the form 
of a Taylor expansion. In this case, reversibility is not obvious, and one has 
to make sure that the approximation is accurate to machine precision in the 
range of arguments which are reasonably expected to occur. 

When f{A) is a Taylor expansion to order N, the coefficients 02, ai and oq 
are polynomials of degree — 2 in a and b. It is easy to see that the number 
of floating point operations is significantly reduced compared to the naive 
computation of the matrix polynomial, with one of the standard techniques 
which are discussed e.g. in 

One way to evaluate the scalar polynomials is to compute them explicitly 
(e.g. with Maple V) to the given order for general arguments and implement 
Horner schemes for them in the simulation program. It is however advanta- 
geous for the implementation of the above algorithm to take into account its 
recursive nature. To see this, we define a series of coefficient sets by 

Ak N~3 Ak 

k=0 k=0 



k=o k\ 



(3.52) 



Obviously we have to start with the triplet 

«2 -jv!'"! "(iV-l)!'''° "(iV-2)!' ^^-^^^ 
and we are finally interested in 

^2 = Ci2 \ ai = ai'\ ao = cio^^ (3.54) 

With the help of 

= -aA - ibi, (3.55) 

the recursion relation for the af'^ comes out as 

(fc) (fc+i) 
a2 = al 

(fc) _ (k+l) (k+l) 
Qj-^ — QjQ CI d'^ 

ao^ = ^^^2)! -^^Q?^^^- (3-56) 

This recursion scheme can be implemented as a loop k = {N — 1) ... 2. In 
such an implementation, one can conveniently adjust the order to which 
the Taylor expansion is used. 
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Accuracy 



The error by approximating the exponential by a truncated Taylor expansion 
of a 3 X 3 matrix can be estimated by 



exp A 



N Ak 

y — 

k=0 



< 



max 



(A^ + 1)! o<s<i 



^A^+iexp(^+i)(As^ 



(3.57) 



Using the Cauchy-Schwarz inequality and the fact that for a unitary matrix 
A, \\A\\2 = 1, one gets an estimate for the error 



N 

exp A — 

A:=0 

3 



Id 



< 



< 



N.ll^^^^lh max II exp(As)||2 
(A + 1)! " " o<s<i " 



(A + 1)! 



Af+l 
2 



(3.58) 



In the leapfrog algorithm, the relevant values of A come from a statistical 
ensemble. We have 



A = t5Tj2X,P„ 



(3.59) 



where St is the step size, {Xj,j = 1 . . .8} are the Gell-Mann matrices, and 
Pj are the normally distributed momenta. Obviously, the norm of A is linear 
in 6t. Numerically, one can see that the norm very rarely (^ 10^^) exceeds 
76t. 

If we take this as a realistic bound on ||v4||2 and demand that the error in 
the result be smaller than 10~^, we obtain an upper bound on St, depending 
on A, 



-{76t 



10' 



St 



(A+1)! 
_ 1 /10-S(A + 1)!\^ 



(3.60) 



In table values for the order A necessary for some example values of St 
are listed. 
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N 
6t 



4 6 8 10 12 14 16 

0.0075 0.0297 0.0677 0.1189 0.1804 0.2498 0.3252 



Table 3.1: Upper bound for 6t depending on the order N of the Taylor 
expansion for the exponential. 



3.3.4 Tests 

We have tested our implementation of the Hybrid Monte Carlo algorithm in 
several ways. 

Reversibility 

For guaranteeing the correctness of the molecular dynamics evolution, one 
should check whether trajectories are truly reversible. It has been seen pre- 



viously in [Q that the classical equations of motions in the Hybrid Monte 
Carlo are chaotic in nature. This means, roundoff errors introduced at any 
place in the computation are amplified exponentially with the Monte Carlo 
time, even for 6t 0. It is not evident however whether this phenomenon 
plays a role for our choice of parameters. 



A remarkable example in [Q] demonstrates that reversibility violations do 



not necessarily cause the acceptance rate to be low. As the leapfrog integra- 
tion scheme has a preference for conserving energy even when the computed 
trajectory deviates much from the true trajectory, the energy difference 6H 
may be small although there are reversibility violations in the configuration 
itself and in observables. 

To check reversibility explicitly, we compute the trajectory once, change 
the sign of all momentum variables, and then compute a new trajectory with 
the same length and step size. The resulting configuration should be identical 
with the starting configuration, apart from unavoidable roundoff errors with 
a magnitude comparable to the machine precision . A measure for the error 
is the quantity 

lj2{\\Uix,fi)-Uix,fi)h), (3.61) 
y XfJ, 

where U is the endpoint configuration. Obviously, this observable is a more 
direct measure than typical observables only involving a subset of all degrees 
of freedom. With several initial configurations and parameters r and St 
used in our study, we could see that this error measure is of the order of 
some 10~^^ on a PC with 64 bit arithmetics. On the APEIOO machine with 
32 bit arithmetics, it turns out to be of the order of some 10~^. 
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First, this indicates that the trajectory computation is indeed correct 
apart from roundoff errors. Second, it shows that at least in the range of our 
parameters r and 5r, there is no sign that errors grow exponentially with 
the number of steps on the trajectory. For larger lattices, we expect that the 
number of steps per trajectory has to be increased. Therefore, at some point 
the integration should become unstable. A detailed analysis can be found in 
55|. 

As an additional check, we have also investigated which effect rounding 
errors have on the measured coupling. To this end, we have implemented a 
version of the program that artificially adds noise to the force in the update 
of the momentum variables. Precisely, in each computation of a P{x,n,j) 
proposal, we substitute 

D,,,S[U] ^ (1 + n)D,,^S[U]. (3.62) 

Here, n is drawn from a Gaussian distribution with width a. The parameter 
a can be chosen to model different magnitudes of errors. 



a 




0.100000 


1.0251(147) 


0.010000 


1.0203(11) 


0.001000 


1.0203(10) 


l.Oe-7 


1.0218(4) 


l.Oe-15 


1.0218(7) 



Table 3.2: Results for the coupling for different values of artificial error mag- 
nitudes a. 



On an APEIOO computer with single precision, we have performed sim- 
ulations at (3 = 9.596 on a L/a = 4 lattice with a = 10"^ 10-^ 10"^ 10"^ 
Table p.2| shows the obtained results. In this table, we have also inserted a 
value a = 10^^ for the undisturbed simulation, assuming that this is the 
typical magnitude of errors caused by single precision arithmetics. This 
simulation result is averaged over 270000 iterations, while the other runs 
consisted of only 50000 — 60000 iterations. As comparison, there is also a 
value of the coupling for a = 10^^^ obtained from a simulation on a PC with 
double precision. A simulation with the HOR algorithms yields the result 
f = 1.02175(44). 

Figure ^]T] shows the results for g'"^ in a plot where the ordinate is log- 
arithmic in a. This illustrates that the coupling is very insensitive against 
the Gaussian distributed errors in the force. Even at cr = 0.01, the coupling 
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Figure 3.1: Coupling in dependence of the artificial error a. 



drifts only on a permille level. At even larger a, the acceptance rate collapses 
and the error grows, so that systematic effects can not be measured. 

This investigation increases our trust in the feasibility of the Hybrid 
Monte Carlo on a single precision computer. Nevertheless, we note that 
it is not clear that our model of Gaussian rounding errors is realistic; differ- 
ent asymmetric models may introduce larger biases. We have also restricted 
ourselves to a specific L/a and (3. 

In addition, we emphasize that in global sums in the APEIOO code, a 
double precision data type implemented in software is used [^6|, |5^. Thus, 



on large lattices with many terms the loss of precision due to cancellations 
is significantly reduced. 

Rounding errors do not only play a role for the reversibility of the HMC 
For the implementation of any algorithm, the validity of the simulation re- 
quires that all link variables remain in the SU(3) group. Due to rounding 
errors, matrices move away from this group in the course of typically thou- 
sands of iterations in a simulation. To avoid this, from time to time it is 
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necessary to "project" them back. Assuming that a variable has the value 
V = u{l + e), where e is the deviation from the true SU(3) matrix u, one can 
construct a corrected variable w with 

X = 2^{'^ ~ '^^'^) 

w = X ^1 — -Imdet . (3.63) 

This formulae are based on the assumption that when e is in the order of 
typical roundoff errors of the machine, is negligible. Then 

v^v ^ 1 + e + 
detv ^ 1 + Tre, (3.64) 



further 



/ 6-6f 

X ~ M I 1 H — 



e-et 



detx ^ 1 + Tr , (3.65) 

2 



and finally 



w ^ M IH 1 - Tr 

\ 2 M 6 



w^'w ~ 1 

detw ^ 1. (3.66) 
All the preceding equations are meant to be valid up to order e^. 

Control variable 

Each step in the leapfrog is area preserving This means, if we denote 
with (P', U') the configuration going into the acceptance step, 

D[P']D[U'] = D[P]D[U]. (3.67) 

Consequently, 

D[P'][DU'] exp{-H[P',U']) 

= J D[P]D[U] exp{-H[P, U] - {H[P\ U'] - H[P, U])). (3.68) 
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If we measure E := exp{—H[P' , U'] + H[P, U]) as an observable, we expect 
exactly 

{E) = 1. (3.69) 

This can be used as a test for the correctness of the leapfrog implementation 
and its reversibility. 



3.4 Local Hybrid Monte Carlo 

We also study here a local version of the Hybrid Monte Carlo which was 



apparently first discussed in As this algorithm does not collectively 

move all degrees of freedom through the phase space, one naturally does not 
expect it to be suitable for non-local actions. For local theories, it is unlikely 
to compete with finite step size algorithms, like Hybrid Overrelaxation. But 
in contrast to HOR, it is suitable for actions which are not linear in the link 
variables. It is therefore a candidate for the update of improved bermions, 
which we will investigate in chapter |. 

In the global Hybrid Monte Carlo, in each step on a trajectory the global 
candidate configuration of new momentum and link variables are updated. In 
the Local Hybrid Monte Carlo (LHMC), a sweep over the lattice is performed. 
For each link variable, a trajectory is computed with all other link variables 
kept fixed. 

When applied to a pure gauge theory, there are some advantages in this 
local approach: 



• Since for each trajectory only a single link variable is changed, the 
discretization error is much smaller than for the global case. Therefore, 
larger step sizes should be possible than for the HMC. 

• Computing the "force" Dx^jS[U] in each step is quite cheap because 
the sum of staples stays constant over a whole trajectory and has to 
be computed only once per trajectory. Therefore, additional steps are 
quite cheap. In this study, this advantage has turned out to be of minor 
importance, as the optimal parameter set has t/6t ^ 2 — 3. 

It is evident that the LHMC is not a suitable algorithm for the simu- 
lation of pseudofermions. With a fermionic action, the computation of the 
force goes with the volume of the lattice, which means that the cost of the 
algorithm is quadratic in the volume. 

We also mention here that for a pure Wilson action, the leapfrog algorithm 
for calculating a trajectory can be replaced by an exact integration of the 
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equations of motion [Q. The method in this reference makes use of the fact 
that in certain subgroups of SU(3) the action takes the form of the energy of 
a pendulum. 



3.5 Results 

Our aim is to represent an efficiency measure in a way that allows a machine- 
independent comparison of different algorithms. Furthermore, we want to 
investigate the cost of computing the observable g, which is directly relevant 
for the computation of the running coupling. Therefore, we define a measure 
'S'cost such that in order to compute g'"^ at 1% accuracy, the equivalent in 
complexity of S'cost computations of all staples is required. Explicitly, 

S'cost = (number of computations of all staples) 
100 ■ (error of g^^) 



X 



(3.70) 



This measure already excludes a trivial volume factor. For the error of the 
cost, we estimate 

(^S'cost = S'cost ) (3-71) 

Tint 

which implies that we neglect the error of the autocorrelation function at 
zero, r(0), as explained in appendix 0. 

For the HMC algorithm, the number of staples computations is the num- 
ber of trajectories times n + 1, when nSr = r is the trajectory length. 

For the LHMC algorithm, we have already noted that further additional 
steps on a trajectory are cheaper than the initial one. In our experience, 
an appropriate estimate for an equivalent of staples computations on our 
computers is 1.6 -|- 0.4n. For example, a trajectory with n = 6 is only twice 
as expensive as one with n = 1. For the Hybrid Overrelaxation algorithm, 
we use 1 + Not- as an equivalent for the staples computations. 

We have performed this study at constant physics in the sense of the 
Schrodinger functional. This means, for each lattice size L/a, the bare pa- 
rameter P is tuned such that the measured coupling of the system matches 
g^ = 2.100. We have considered lattice sizes L/a = 4, 6, 8, 10. Except for the 
L/a = 4 case, the tuned (3 values were taken from [^]. Table ^]B| shows the 
values we have used. 

Most of the simulations were done on APEIOO machines with 8 nodes, 
while the largest lattices were simulated on machines with 128 nodes. For 
the Hybrid Monte Carlo algorithms, we have used trajectory lengths of 0.5, 
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L/a 


/5 


4 


6.7796 


6 


7.1214 


8 


7.3632 


10 


7.5525 



Table 3.3: Tuned (3 values for various L/a belonging to g'^ = 2.100. 



1.0 and 2.0 and varied the step size 6t in a range with reasonable acceptance 
rates Pace- In addition to the coupling, we have also measured the acceptance 
rates. This was motivated by the common folklore that optimal efficiency 
can be achieved with an acceptance of about 70%. It thus makes sense to 
parametrize plots of the cost also by the acceptance rate. For the Hybrid 
Overrelaxation algorithm, we have varied A'^or from (corresponding to a 
pure heatbath) to 3. 

Table p.4| shows the cost of the HOR algorithm. The corresponding plot 



can be found in figure 3.2. For all lattice sizes L/a, the optimal choice of A^o 



turns out to be 1. Table p.5| , in which we list the autocorrelation times for 
the same runs, indicates the reason for this: for A'^or = 1, there are almost no 
autocorrelations. If the number of overrelaxation sweeps per measurement 
is increased, the computational cost increases, while Tint in units of updates 
cannot decrease further. 



In table |3.6| , the results for the HMC algorithm are listed. The figures |^ 
and p.4| contain plots of these data for L/a = Q, L/a = 8 respectively. Both 
plots show the data with constant trajectory length connected with dashed 
lines. 

In all cases, the optimal trajectory length is 1. One can also see that for 
each trajectory length, the minimal cost is achieved at an acceptance rate 
of roughly 70 %. This is consistent with conventional wisdom. When L/a 
is increased, the step size has to be scaled down in order to gain optimal 



performance. In it is argued that for the leapfrog discretization, the 



average acceptance scales as 

(Pace) =exp{-(5rL/a)^}, (3.72) 

and therefore the step size must be changed with 6t oc a/L in order to 
keep the acceptance rate fixed. This seems to be roughly the case in our 
simulations. 



Table 3.7 contains our results for the cost and acceptance rates of the 



LHMC algorithm. These data are illustrated in figure |3.5| for L/a = 6 and 
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L/a iVor = iVor = 1 iVor = 2 = 3 Ar„, ^ 4 A^or = 5 TVp, = 6 

4 5.03(1) 3.41(5) 4.07(6) 4.98(7) 

5 9.20(26) 5.68(10) 6.42(9) 7.61(11) 

6 13.8(4) 8.71(16) 9.31(2) 11.0(2) 12.4(2) 14.4(2) 16.3(3) 

8 28(1) 16.5(3) 17.9(3) 19.2(3) 

10 46.3(2) 25.4(7) 26.4(6) 30.1(7) 



Table 3.4: ^cost/lOOO for the HOR algorithm for different L/a and N, 



L/a Nor = A^or = 1 A^or = 2 N^^ = 3 A^pr = 4 N^, ^ 5 Np^ = 6 

4 2.10(5) 0.72(1) 0.57(1) 0.52(1) 

5 2.74(8) 0.86(1) 0.65(1) 0.58(1) 

6 3.21(8) 1.02(2) 0.72(1) 0.64(1) 0.57(1) 0.55(1) 0.53(1) 
8 4.5(2) 1.27(2) 0.92(2) 0.75(1) 

10 5.4(2) 1.47(4) 1.02(2) 0.86(2) 



Table 3.5: Tint for the HOR algorithm for different L/a and N 
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Figure 3.2: 5'cost for the HOR algorithm. 



47 



L/a 




T / 6t 


acceptance in % 


c 

*-^cost 




4 


0.5 


8 


50.5(2) 


6.71(18) 


• 10^ 


4 


0.5 


12 


77.1(1) 


5.97(12) 


• 10^ 


4 


0.5 


15 


85.1(1) 


6.66(16) 


■ 10"^ 


4 


0.5 


20 


91.5(1) 


7.72(14) 


■ 10^ 


4 


1.0 


20 


62.8(2) 


5.55(11) 


• 10^ 


4 


1.0 


30 


83.2(1) 


5.07(8) 


• 10^ 


4 


1.0 


40 


90.2(1) 


5.77(8) 


• 10^ 


4 


1.0 


60 


95.7(1) 


7.84(14) 


■ 10^ 


4 


2.0 


30 


38.3(2) 


1.85(4) 


■ 10^ 


4 


2.0 


60 


83.0(1) 


1.09(1) 


• 10^ 


4 


2.0 


100 


93.9(1) 


1.46(2) 


• 10^ 


6 


0.5 


10 


30.5(3) 


3 05('18'l 


• 10^ 


6 


0.5 


15 


64.8(2) 


2.34(9) 


■ 10^ 


6 


0.5 


30 


91.0(1) 


3.17(10) 


■ 10^ 


6 


0.5 


50 


96.8(1) 


5.01(16) 


■ 10^ 


6 


1.0 


20 


22.8(9) 


4.54(22) 


• 10^ 


6 


1.0 


30 


59.1(3) 


1.96(5) 


• 10^ 


6 


1.0 


40 


76.3(2) 


1.87(4) 


■ 10^ 


6 


1.0 


60 


89.3(2) 


2.13(4) 


■ 10^ 


6 


2.0 


60 


59.4(2) 


3.22(8) 


• 10^ 


6 


2.0 


80 


76.4(2) 


2.90(6) 


• 10^ 


6 


2.0 


120 


89.2(1) 


3.20(6) 


• 10^ 


6 


2.0 


160 


94.1(1) 


3.92(7) 


■ 10^ 


8 


0.5 


15 


39.6(3) 


7.89(40) 


■ 10^ 


8 


0.5 


20 


63.6(2) 


5.25(23) 


■ 10^ 


8 


0.5 


30 


83.0(2) 


6.16(35) 


• 10^ 


8 


0.5 


50 


94.1(2) 


8.63(45) 


• 10^ 


8 


1.0 


30 


30.7(3) 


8.64(41) 


■ 10^ 


8 


1.0 


40 


56.4(2) 


5.33(16) 


■ 10^ 


8 


1.0 


60 


79.9(3) 


4.27(13) 


• 10^ 


8 


1.0 


80 


89.3(4) 


4.91(14) 


• 10^ 


8 


2.0 


80 


57.5(4) 


7.15(28) 


• 10^ 


8 


2.0 


120 


80.4(2) 


6.19(18) 


■ 10^ 


8 


2.0 


160 


88.5(2) 


7.11(21) 


• 10^ 



Table 3.6: S'cost for the HMC algorithm for diSerent L/a, r and Sr. 
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Figure 3.3: 5'cost for global Hybrid Monte Carlo at L/a — 6. For each data 
point, the corresponding acceptance is shown. 
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Figure 3.4: 5'cost for global Hybrid Monte Carlo at L/a — 8. 
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L/a 


r 




acceptance in % 


'S'cost 




4 


0.5 


2 


85.3 


7.39(16) 


• 10^ 


4 


0.5 


3 


93.7 


9.08(18) 


•10^ 


4 


0.5 


5 


97.8 


1.37(3) 


• 10^ 


4 


1.0 


3 


67.6 


6.11(30) 


■ 10^ 


4 


1.0 


4 


82.6 


3.41(11) 


■ 10^ 


4 


1.0 


5 


89.5 


3.19(9) 


• 10^ 


4 


1.0 


10 


97.6 


4.33(10) 


• 10^ 


4 


2.0 


6 


73.9 


4.00(11) 


• 10^ 


4 


2.0 


10 


93.5 


2.92(6) 


■ 10^ 


4 


2.0 


20 


98.2 


5.70(11) 


■ 10^ 


4 


4.0 


20 


92.0 


2.36(9) 


■ 10^ 


6 


0.5 


2 


85.1 


2.46(7) 


• 10^ 


6 


0.5 


3 


93.7 


2.22(5) 


• 10^ 


6 


0.5 


5 


97.8 


2.74(6) 


■ 10^ 


6 


0.5 


10 


99.5 


4.14(10) 


■ 10^ 


6 


1.0 


4 


80.7 


1.15(6) 


■ 10^ 


6 


1.0 


5 


88.2 


8.60(34) 


• 10^ 


6 


1.0 


10 


97.3 


8.74(28) 


• 10^ 


6 


2.0 


6 


71.7 


1.04(4) 


• 10-^ 


6 


2.0 


10 


94.7 


4.62(11) 


■ 10^ 


6 


2.0 


20 


98.6 


7.32(17) 


• 10^ 


8 


0.5 


2 


85.1 


4.94(18) 


• 10^ 


8 


0.5 


5 


97.7 


4.96(15) 


• 10^ 


8 


0.5 


10 


99.4 


7.71(23) 


■ 10^ 


8 


1.0 


4 


79.7 


2.78(21) 


■ 10^ 


8 


1.0 


5 


87.5 


2.16(13) 


• 10^ 


8 


1.0 


10 


97.1 


2.09(11) 


• 10^ 


8 


1.0 


20 


99.3 


3.30(16) 


• 10^ 


8 


2.0 


8 


89.5 


9.82(37) 


■ 10^ 


8 


2.0 


10 


94.6 


9.40(31) 


■ 10^ 


8 


2.0 


20 


98.8 


1.44(5) 


• 10^ 



Table 3.7: S^osx. for the LHMC algorithm for different L/a, r and St. 
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Figure 3.6: 5'cost for Local Hybrid Monte Carlo at L/a 
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Figure 3.7: Scaling behavior of r(0) from runs with the HOR algorithm, in 
a double-logarithmic plot. 
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Figure 3.8: Optimal 5'cost for the different algorithms in a double-logarithmic 
representation. For comparison, dotted lines which correspond to exponents 
z = 1,2 are displayed. The points denoted with HB correspond to a pure 
heatbath. 
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3]6| for L/a = 8. For all L/a, the optimal trajectory length is r = 0.5. It is 
remarkable that the cost for r = 2.0 is smaller than for r = 1.0. In order 
to check the behavior for even longer trajectories, we have performed runs 
with lower statistics and found that the cost increases again. This may be an 
indication that performance oscillates in the trajectory length. The reason for 
this could be that the phase space that can be exhausted by a local trajectory 
is quite small, because a link variable is restricted to the SU(3) manifold. A 
"good" trajectory is one that is equivalent with an overrelaxation update. 
When the trajectory length is further increased, it moves on cycles on the 
sphere. 

Compared to global Hybrid Monte Carlo, step sizes can be made much 
larger, as expected. Optimal acceptance rates are also much larger. 

In the following, we investigate how the performance of these algorithms 
scales with L/a. In lattice field theories, the continuum limit is defined near 
second order phase transitions in the bare parameter space. Simulations 
in this area suffer from critical slowing down, i.e. the autocorrelation time 
increases rapidly as the continuum is approached. One expects that the 



autocorrelation time diverges as I^T 



rint~min(L,0^ (3.73) 

where ^ is the correlation length and z is an exponent specific to the algorithm 
used. It is not clear that the definitions of z in the regimes L > ^ and L < ^ 
are equivalent |6^. For simple local algorithms like the heatbath, an intuitive 



understanding is that modifications of variables are transported around by 
diffusive dynamics, which suggests a critical dynamical exponent z = 2 near 
the critical point. As the main cost of QCD simulations arises from the 
largest lattices, z is an important parameter to characterize algorithms. 



L/a r(0) 

4 0.0311 

5 0.0378 

6 0.0491 
8 0.0735 

10 0.0982 



Table 3.8: r(0) for runs with the HOR algorithm with different L/a. We 
have not determined errors for these data. 



According to ( |C.8|) , the error of an observable A is the product of r(0) 
and the autocorrelation time. The former is a property of the statistical 
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ensemble whereas the latter is a property of the simulation algorithm. Both 
have their separate scale dependence. In table |3.8| we have listed r(0) for the 
HOR algorithm, for which our data goes to L/a = 10. Figure p.7| shows a 
logarithmic plot. For a fit of log(r(0)) against log(L/a), we drop the value for 
L/a = 4 because the scaling behavior is only reached asymptotically. From 
the fit, we find approximately 

r(0) oc (L/a)'-'' . (3.74) 

Therefore, we expect our cost measure to scale like 

S-cost oc (L/a)^•''+^ (3.75) 

Figure shows the cost of the studied algorithms at different lattice sizes 



in comparison. For each L/a, we have plotted the data with optimally tuned 
parameters. The figure also indicates exponents z = 1, 2 by dashed lines. 
The Hybrid Overrelaxation algorithm almost reaches z = 1, while the other 
algorithms scale slightly worse. 

At L/a = 8, the observed ratios in ^cost for tuned parameters are 1 : 3 
: 26 for HOR : LHMC : HMC. This illustrates the overhead of HMC even 
before dynamical quarks are included. 
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Chapter 4 
Bermions 



In this chapter, we are going to investigate the Schrodinger functional cou- 
phng in a Yang-Mills theory coupled to a bosonic spinor field. Formally, it 
corresponds to setting the number of flavors in the fermionic partition func- 
tion to A^f = —2. In comparison to the full QCD case, this theory has a local 
interaction and is therefore much cheaper to simulate. 

In the literature, the term bermion model has been coined for this theory. 
It was introduced for the purpose of studying unquenching effects without 
immediately moving to the full QCD theory 0, The emphasis in 
these articles lies on the extrapolation of results from negative to positive 
flavor numbers. In perturbation theory, this can surely be justifled. There, 
the flavor number is just a parameter in the Feynman graphs, and increas- 
ing powers of A^f appear with increasing loop order. Therefore, quantities 
calculated to flnite order are polynomials in A^f. 

Non-perturbatively, the A^f dependence is not straight-forward. An ex- 
trapolation cannot be done in a reliable way and with errors under control. 
Here, we are not trying to obtain quantitative information for the full theory 
from the bermion theory. The model serves as tool for studying improvement 
and the approach to the continuum limit. 



The Schrodinger functional, as the partition function of the system, is an 
integral over all gauge and quark flelds which fulflU the given boundary con- 
ditions. After integrating out the quark flelds, it is given by 



4.1 Model 



/ 



D[U]D[^]D[ij]e 



SlU,ip,'4>] 




(4.1) 
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with the fermion matrix 



M = 2K{D + 6D + mo), k = (8 + 2amo)"^ (4.2) 
For A^f = —2, the determinant can be integrated in, 

Z = I D[U]D[(P^]D[(P]e~^<^''^\ (4.3) 

Here, the fields (pix) are located on the lattice sites and carry Dirac and color 
indices. However, the components are complex numbers and can therefore 
be regarded as boson fields. The bermion action St, is given by 

S4<P,<p\U] = J2\M<p{x)\'. (4.4) 

X 

In order to completely define the action, we have to fix the improvement 
coefficients Ct, Ct and Csw For the first two, the choice is easy: these coef- 
ficients are only known in perturbation theory, and perturbative results are 
computed with general A^f. Therefore, we can take the formulae (|2.44|) and 



( |2.54| ) and insert A^f = —2 and the value go used in the respective simulation. 

For the Sheikoleslami-Wohlert coefficient Cgw, the situation is more com- 
plicated. It has been determined non-perturbatively for the range of bare 



couplings < go < 1 interesting for the ALPHA programme for A^f = ||3^ 
and A^f = +2 The results can be represented as smooth functions of the 
bare coupling. 



1 - 0.656^^ - 0.152c/^ - 0.054^g 
^f=o ~ 1 - 0.922(72 



_ 1 - 0.454^g - 0.175^g + 0.012^6 + 0.045^« 
CsM\r,,=2 - 1 - 0.720^7g ■ ^ 

These interpolation formulae are arranged such that they go over to the 



A'f-independent 1-loop result for (70 —> |^| 



CsM = 1 + 0.2659(l)(7o' + O(^o^). (4.6) 

For the purpose of this work, performing the method proposed in these 
papers also for A^f = —2 would be a large undertaking. We have therefore 
chosen to extrapolate the non-perturbative data from A^f = and A^f = +2 
linearly in A^f. Only for the most critical parameters used in our simulations, 
we have explicitly calculated Csw with the method utilized in the papers cited 
above and checked that our extrapolation matches the obtained values with 
good accuracy. This will be explained in more detail in section O. 
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4.2 Simulation algorithm 



In |6^, the bermion model was investigated without improvement. There, it 



was straight-forward to find a suitable algorithm: 

• The gauge fields are updated with a Hybrid Overrelaxation algorithm 
as described in section |3^ . 

• The boson fields are updated with an overrelaxation algorithm. 

The important property of the unimproved action that made this possible 
is that it is linear in each link variable. We found it practically impossible 
to generalize these finite step size methods to the improved case, where the 
action depends quadratically on the individual link variables. 

From our experience with the Local Hybrid Monte Carlo algorithm, as 



discussed in section this was an alternative candidate for the update of 
the gauge field. However, the inclusion of the clover term defeats its main 
advantage. In that case, the force D^^^jS\U] depends on the link U{x,^) in 
a non-trivial way and has to be recomputed in each step on a trajectory. 
We therefore expect this algorithm to be relatively expensive for improved 
bermions. 

After some experimentation, we decided to update the gauge field by 
applying local overrelaxation with respect to the action without clover term. 
An acceptance step with the local action difference with respect to the full 
action then corrects for the error made. The assumption behind this method 
is that the clover term only represents a small correction to the unimproved 
action. Therefore, the acceptance rate should be reasonably high. Our results 
confirm this assumption. 

As the proposals for the gauge field update are fully deterministic and set 
up in a special way, ergodicity may be severely hampered when the bermion 
fields are also updated with an overrelaxation method. Therefore, we have 
changed the boson field update to a global heatbath. 

4.2.1 Gauge fields 



In order to apply the overrelaxation method described in section |3.2.2 



we 



have to isolate a part of the full action that is of the form ( p.l6| ). For the 
derivation of such a contribution, we utilize the operator M„ defined by 



M^0(x) = --|A^(1 - 7^)f/(x,/i)0(x + afi) + 

+ 1p)U{x - ajl,^)(j){x - ajl)y (4.7) 
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Furthermore, we introduce = M^^i^pj-oved — M^, which is the contribution 
to the bermion matrix from the directions orthogonal to fi. Then the boson 
action can be written as 

S'b = I Mu„improved0(a;) ^ + | Munimproved0(a; + O/i) | ^ H 

= 2Re(M^0(x),M^0(x)) 

+2Re(M^0(x + a/i),M^0(x + afi)) H 

= Re|(M^0)"f(x)A^(l -7^)f/(x,/i)0(x + a/i) 



^ Re ^ [(M^0)^(x)] ^^Uo^p 4>{x + a/i) 



(M^0)t(x + a/i)(l + 7^) 



A/3 



= -2KReTr{[/(x,/i)^"^(x,/i)} + --- (4.8) 

Here we have used A to denote the Dirac index and a, /? to denote color 
indices. The SU(3) matrix ^ can be computed as the dyadic product of 
matrices v and w, = •^^('"^^'^)/3o- These matrices in turn are given by 

w = (1 +7^)M^0(x + a/i) + -(1 -7^)0(x + a/i). (4.9) 

a 

The action we use to generate a proposal for a link variable now is 

^prop = -ReTr{f/(x,/i)0(x,/i)} (4.10) 

with 

V{x, /i) = ^S{x, /i) + 2/€ e(x, /i), (4.11) 

where S{x,fi) is the staples matrix as defined in ( ^.17[ ). 

Note that in the action Sp^op we have only considered terms from the 
unimproved bermion action. In principle, we could have included the contri- 
bution to S'b which stems from the improvement coefficient Ct and which is 
linear in U{x,fi). But this term is very small, and it turns out that varying 
Ct does not change the acceptance rate of this algorithm significantly. On the 
other hand, including it in the proposal action would complicate the imple- 
mentation of the algorithm such that optimization blocks are broken and the 
program is slowed down. It is therefore probably more efficient to include 
the tiny Ct contribution into the acceptance step. 
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The SU(3) overrelaxation mechanism described in section p.2| , in which 
three SU(2) subgroups are "flipped" subsequently, is not set up symmetri- 
cally. This means, the probability of going from one value to another is not 
the same as the probability of the reverse transition. 

This problem can be cured by choosing the order of subgroups randomly, 
i.e. with a probability of 1/2 for the sequences 1,2,3 and 3,2,1. Thus, 
starting from a variable U we generate a proposal U' with the probability 

Pp{U' ^U) = ]^[5{U' - A^A2AiU) + 6{U' - AiA2A^U)]. (4.12) 

The application of these matrices is discussed after ( |3.29| ). The proposal is 
then accepted with the probability 

PA(f/' ^U)= min{l,exp(-^[0,0t,t/'] + 0+, f/])}. (4.13) 

We can show that the combined probability 

P(U' ^U) = Pa{U' ^ U)Pp{U' ^ U) (4.14) 

fulfills detailed balance. Since Pp{U' ^ U) = Pp{U ^ U'), we transform 

P{U' ^ U) exp(-5[f/]) 
= Pp{U' ^ U) min{exp(-S[f/]), exp(-S[f/'])} 

= Pp{U' ^U)Pp^{U ^U')eM-S\U']) 

= P{U ^ U')expi-S[U]). (4.15) 

We continue with a discussion of the implementation of this method. In 
the acceptance step, the change of the action for the proposed change of a 
link variable is needed. For the gauge part of the action, the difference 

Sg[U']-Sg[U] = -^ReTi {[U'{x, fi) -U{x,iJ,)]s\x,iJ,)} (4.16) 

can easily be obtained. The analogous difference for the bermion contribution 
is more complicated. A change of the variable U{x, /i) affects the clover leafs 
centered around fourteen lattice sites, 

F = {x, X + a/t, X ± az>, X + a/t ± az>}, ^ fi- (4-17) 

Hence, a naive computation of S'b[t/'] — Si,[U] implies fourteen applications 
of the Dirac operator for each local step, which would make the acceptance 
step very expensive. 
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Instead, we introduce an auxiliary field ip = M[U](j) which is always kept 
up to date during a sweep over the lattice. At the beginning of a sweep, 
the Dirac operator is applied globally. Then, for each proposed local change 
U'{x,fi) U{x,fi), proposals ip'{z) ^ ipi^z) are computed for the fourteen 
affected sites z. The bermion action difference used in the acceptance step is 



S4<p, <p\ u'] - S4<p, 0t, u] = E {l^'(^)r - imi'}- 



(4.18) 



The proposals for the auxiliary field are accepted or rejected together with 
the proposal of the link variable. With this method, in each local step only 
the difference M[U'](j){z) — M[U](j){z) has to be calculated, which involves 
fewer floating point operations than the computation of M(f){z). 

In order to make this explicit, we note that the action depends on a link 
variable f/(x, /x) through either the hopping or the clover term. We split 
M = Ml + M2 into a term Mi which is diagonal in coordinate space and a 
term M2 which contains nearest-neighbor contributions. Then the proposals 
are given by 

^P'iz) = ^(z) + A«0(z) + AiX^), (4.19) 
where we have to compute 



Mi[U']<P{z) ~ Mi[U]<P{z) 
M2[U']<P{z) - M2[U]<P{z) 



(4.20) 



The Hopping term affects the sites x and x + a/i. 



X] 



^fl(t){x + afi) 



a 

K 



7^) U' {x, fi) — U{x, fi) (j){x + afi) 



-A* (1 + 7;.) U'\x,fi)-U\x,fi) 0(x). (4.21) 



The clover term affects all z ^Y. At z = x + ajj, + aO we get for example 



Ad) 



X 



[x + afi + av) = — KCswO-ui/ 
8a 



UK 



X 



au 



/i)[/t( 



X, Z/ X 



U'{x, fi) — U {x, fi) U{x + afi, z/)0(a; + afi + aP). 



(4.22) 



Similar four-link products with (j){z) insertions are obtained at the other sites. 

Let us summarize our method. The update of the gauge fields begins with 
a global application of M in order to compute the field ip. Then, a sweep 
over the whole lattice is performed. In each local step, we do the following: 

• Compute the SU(3) matrix V^(x, /i) given by ( [4.111 ). 
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Compute a proposal U'{x,fi) for the local link variable by an overre- 
laxation step, which leaves the action (|4.10| ) constant. 

Compute ip'{z) for z following ( [4.19| ). 

Compute S[(j), f/'] _ S[(j), 0t, U] using ( CT) and (|4:i8| ). 



Accept or reject the U and ^ proposals with the probability ([4.13|) . 



4.2.2 Boson fields 

In a given gauge field background, one can easily find an update algorithm 
for the boson fields. The action is quadratic in 0, so the needed distribution 
is 

P(0) oc exp{- ;^(M0(x), M(P{x))}. (4.23) 

X 

This can be achieved by generating a field ( with the distribution 

P(C)ocexp{-^(C(x),C(x))} (4.24) 



and then computing (f)' = M ^C,. Equation (f4.24|) is a Gaussian distribu- 



tion with variance 2 for the real and imaginary components of 0(x)^q,. An 
algorithm for generating Gaussian numbers on the APEIOO is described in 
appendix 

The direct usage of this heatbath method implies that one has to solve 
the equation C, = Mcf)' with full precision, which can be quite costly. If one 
reduces the precision of the solver, the correct distribution ( |4.23| ) gets dis- 
turbed in an uncontrolled way. By introducing additional degrees of freedom, 
one can get around this problem. Following p^, we introduce a field x with 



the same indices as the boson field 0, distributed like 

P{X) = exp |- 5: \x{x) - M0(x)|2| . (4.25) 

As / DxP{x) = const is independent of (p, one can then simulate the com- 
bined distribution 

P{(j),x) = exp |-^[|M0(x)p - \x{x) - M0(x)p] | (4.26) 

without affecting the distribution of the field. This can be done as follows. 
One first updates x by a global heatbath step. To this end, one generates a 
normal random vector 7] and computes 

X'{x) = M(j){x) + r]{x). (4.27) 
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In a second step, one updates with an overrelaxation method that leaves 
the action constant, 

= C(x) - a^) = M-\{x). (4.28) 

We notice that in each update, the new x fi^ld is constructed from and a 
random vector, and it is not used after the update is finished. Therefore, it 
can be dropped and does not need to be stored in memory. 

Obviously, if the equation x = is solved exactly, this algorithm is 
completely identical with the one described above. However, we notice that 
this version can be turned into a Metropolis algorithm. By inverting the 
fermion matrix only up to a residue vector r, one can generate a proposal 0' 
for an acceptance step, 

MC{x) = x(^) + ^(2^) 
0'(x) = C{x)~<p{x). (4.29) 

For a Metropolis algorithm with a deterministic proposal, detailed balance 
is equivalent with the requirement is that the proposal transition must be 
symmetric in and 0'. As in (|4.29| ), ( is independent of the old field, this is 



the case here. The action difference in the acceptance step can be computed 
as 

AS = 5:{|M0'(a;)|2 + |x(x)-M0'(x)|2-|M0(a;)|2-|x(x)-M0(x)|2} 
= 2Re5]rt(x)(M0'(x) -M0(x)). (4.30) 

X 

For the inversion of the bermion matrix, we use a stabilized biconjugate 
gradient (BiCGStab) solver with SSOR preconditioning. A study of an im- 
plementation of this solver for the APEIOO machine and with Schrodinger 
functional boundary conditions can be found in 



4.3 The size of Cgw 

In all our simulations, we have chosen the improvement coefficient Csw by 
extrapolating the non-perturbative formulae in Nf. This ansatz is consistent 
with 1-loop perturbation theory, and also supported by the fact that the 
2-loop contribution is linear in A'^f. Nevertheless, we have computed Csw non- 
perturbatively for A^f = —2 for the bare coupling P = 8.99 to check our 
assumptions. 

By definition, improvement coefficients for 0(a) improvement have an 
ambiguity. Different choices for these coefficients correspond to different 
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O(a^) artefacts. Therefore, it is important to use the same improvement 
condition as was used in the computation of ([4.5|) . Consequently, we follow 
the method described in |^ The idea there is to use an observable 



that vanishes in the continuum limit and has only O(a^) corrections if the 
improvement coefficients are chosen properly. Such an observable can be 
deduced from the PCAC relation. 

One first observes that the computation of the current quark mass m(xo) 
in ( |2.71| ) includes the coefficient ca which is not known a priori. In order to 
eliminate it, we define 

s{xo) = ^ad^dofp{xo) (4.31) 

and write 

m(xo) = r(xo) + cas(xo). (4.32) 

From the correlation function /p at the upper boundary, one gets another 
current mass m'. Then the mass defined as 

m{yo) - m'iyo) 



M{xo,yo) = m{xo) - s{xo)- 



= rfa)-.K) ''i^°>-'-y°> (4,33) 

is independent of ca- Again, a similar quantity M' is obtained by exchanging 
variables with and without a prime. For this section, we define the current 
mass M as M(T/2,T/4). Finally, the difference 



AM = M — ,- -M' — ,- (4.34) 




is zero in the continuum and inherits lattice artefacts from the correlation 
functions used in its definition. In order to ensure the perturbative asymp- 
totics Csw{go) — * 1 for go 0, we precisely demand that AM takes its 



tree-level value at M = 33 



AaM = 0.000277. (4.35) 

This improvement condition is imposed at L/a = 8 and T = 2L, furthermore 
6k = Q and with boundary fields 

(01,02,03) = ^(-7r,0,7r) 
o 

(0;,0'2,0;j) = i(-57r,27r,37r). (4.36) 
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Using the setup described above, we have measured the current mass aM 
and aAM for some values of k at three values of Cg^. As shown in table |4.1| , 
these results are in line with the observations in ^ , which indicate that 
AM depends only weakly on the mass M. Since it is not critical whether 
the condition ( [4.35 ) is precisely imposed at zero current mass, we have not 
extrapolated the mass to zero, but instead taken the results where it roughly 
vanishes. The error introduced hereby is negligible. Our results for the lattice 
artefact aAM are summarized in table J4.2|. 









aM 


aAM 


0.13209 


1.171^ 


^15 


0.01899(12) 


0.00219(18) 


0.13213 


1.171^ 


^15 


0.01825(14) 


0.00237(21) 


0.13241 


1.171^ 


^5 


0.00946(13) 


0.00218(19) 


0.13209 


1.271^ 


^15 


-0.00030(19) 


0.00066(24) 


0.13105 


1.371^ 


^15 


0.01173(13) 


-0.00124(18) 


0.13139 


1.371^ 


^15 


0.00061(17) 


-0.00127(21) 



Table 4.1: aAM at three trial values of Csw at (3 = 8.99 and some values of k 







aM 


aAM 


1.171^ 


115 


0.0095(1) 


0.00218(19) 


1.271^ 


U5 


-0.0003(2) 


0.00066(24) 


1.371^ 


115 


0.0006(2) 


-0.00127(21) 



Table 4.2: aAM for different values of at 13 = 8.99. 



A linear interpolation of these three points to ( [4.35|) gives Csw = 1.285(7), 
which is very near to the value Csw = 1.271815 from our extrapolation in 
A^f. By 1-loop perturbation theory, one can see that this difference causes an 
error in the step scaling function smaller than 4 ■ 10^"^ which is negligible 



compared to the statistical errors as listed in table 4.6 



4.4 Measurement of the coupling 

The Schrodinger functional coupling is defined by ( |2.36| ). In a Monte Carlo 
simulation, we have to measure the average value of dS/dr], which gets con- 
tributions from the pure gauge action Sg and the bermion action S'b. In both 
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terms, one needs in the end the derivative of a boundary hnk with respect 
to T). With the given boundary conditions, we get 



da 



drj 



dCi 



rj=0 



drj 



ri=0 



(4.37) 



where we have defined fig — diag(l, —1/2, —1/2). For the gauge action, we 
evaluate 



drj 



3 t ^ 

r)=0 X k 

U{x, k)U{x + ak, 0)C/^(x + aO, k)U\x, 0) 



xo=0 



U{x, k)U\x + ak- aO, 0)U^{x - aO, k)U{x - aO, 0)1 |. (4.38) 

The computation of the bermion contribution to the couphng is shghtly more 
comphcated. It suffices to compute dM(p{x)/dr], because 



drj 



T]=0 



d_ 
drj 



^{{M<P)\x){M<P){x)} 



d 



2ReY.^\x) — {M<p{x)). 



(4.39) 



Clearly, only the clover term at xq — a and xq — T — a can contribute to the 
derivative. Furthermore the derivative of the magnetic components vanishes, 
§jFij{x) = 0, and we have ^Fok = -§jFko and aok = -CTkQ. Thus, for 
X & X — (a, x) U (T — a, x) , 



dM(j)(x) i d ^ ( A / X / ^^ 

drj 2a drj ^^^^ J 



(p{x) 



d 



8a 



«;CswX!^*;o |^Qfeo(2;) -h.c.|0(a;). 



(4.40) 



We define for each lattice point x and plane (/x, i^) the color matrix 
with elements 

^nu,ba(^) = '^i{x)a^^(l)b(x) (4.41) 

and its hermitian part XnA^) = Ctiui^) + Cl^i,{x), where we have explicitly 
displayed the color indices of the bermion fields. Using this, we obtain 



dS^ 
dr} 



?7=0 



la" 



-Im J2 i^H^)J2^ko 



' dQkojx) 
dr] 



— h.c. (j){x) 
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KCa 



4a 



4a 



ImTr Y^^koix) 

xGX k 

ImTr J2 I3Xfco(a;) 
xex k 



( dQkojx) 
[ drj 
dQkojx) 
drj 



h.c. 



(4.42) 



The computation of the derivative is analogous to the gauge action case. For 
practical purposes it is advantageous to organize the above sum in terms 
of time-like plaquettcs at the boundary. Each plaquette contributes to two 
different clover leafs. So we can write 



dr] 



-I^Tr^^(4(x) + i?,^(x 



?7=0 ^ ^ X 

where we have contributions from the lower boundary 

i?j^(x) = -U\x, 0) [inoUix, k)] U{x + ak, 0) x 

a 

xU\x + aO, k)x{x + aO) 
+ U\x + aO, k)U\x, 0) [inoU{x, k) 

xU{x + ak, 0)x{x + aO + ak) 



and the upper boundary 



(4.43) 



(4.44) 



m(x) = -U{x-a6,k)U{x-a6 + ak,0) inoU{x,k) 
a L 



X 



xU''{x-aO,0)x{x-aO) 
+ U{x - aO + ak, 0) [inoU{x, k)\ V^(a; - aO, 0) x 
xU{x — aO, k)x{x — aO + ak). 



(4.45) 



In the implementation of the measurement, bermions do not produce a large 
overhead. The computation of dSh/df] can be done in a loop together with 
dSg/drj. The effort for a measurement grows only with (L/a)^, whereas the 
cost of the same computation for A^f = +2 is proportional to the lattice 
volume. 



4.5 Tests 

Large parts of the simulation program consisted of new code; the gauge field 
update, the boson update and the measurement of the boson contribution 
to the coupling. In order to check the program carefully for correctness, it 
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is desirable to separate the test for different parts of the code as much as 
possible. 

We have done this as follows: in the implementation for unimproved 
bermions, the overrelaxation boson update there can be replaced by the heat- 
bath boson update discussed in section ^4.2.2| and setting the improvement 
coefficients Csw = 0, Ct = 1. We have confirmed that the modified program 
still yields the same results as before. 

Then we have run the improved bermion program consisting of the gauge 
field update from section 4.2.1 and the boson update from section 4.2.2 with 
various choices of the improvement parameters and checked a control variable 
discussed below. This tests the update algorithm as a whole. 

Provided the boson update on a given gauge field generates a sample 
distributed according to the action Sb, the measurement routine for dSh/drj 
can then be checked on the classical background field by comparing with the 
tree-level perturbative result. 



4.5.1 Control variable 

In general, it is advantageous to have a nontrivial observable whose mean is 
known exactly for any choice of parameters and bare couplings. This allows 
a consistency check without having to compare results e.g. with the results 
of other programs, which may not even be possible. In particular, when 
comparing with another Monte Carlo result, one compares two quantities 
which both are affected with statistical errors, whereas the comparison with 
a constant value only involves one statistical error and is therefore more 
reliable. 

In the bermion model, it is easy to construct such an observable. We 
define 

E= Jirj., . .. El^'/'(^)r. (4-46) 

l2{LlaY[{Tla) - IJ V 

where {L/aY[{T /a) — 1] is the number of sites with fluctuating boson fields 
and 12 is the number of Dirac and color components. 

As a generalization of the property of the one-dimensional Gaussian mea- 
sure, 

(x^) = 1 for P(x) oc e"^^', (4.47) 



we expect the expectation value of E, which corresponds to a path integral 
with Gaussian measure, to be 

{E) = 1. (4.48) 
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4.5.2 dSi,/dr] with a static gauge field 

By switching off the gauge field update routine, the bermion program can 
be transformed into a Monte Carlo simulation for a boson field on a fixed 
gauge field background. We have initialized the gauge fields to the classical 
background field given by (|2.26|) and run the modified simulation program. 
The fields then should be distributed like exp(— S'b[0, 0^, V^]). 

For this distribution, the result for the boson contribution to the coupling, 
dSi^/dr], can be calculated exactly using perturbation theory. We first make 
this calculation for the fermionic action S{. The fermionic contribution to 
the coupling comes from the term 



dSf 



Note that with ip, we denote Grassmann fields here. In the classical back- 
ground field, all time-like links are equal to the unity matrix, and all space- 
like links are independent of x. Therefore the magnetic components dF/dr] 
vanish. The other components are translational invariant in the space direc- 
tions, such that we can write 



(4.49) 



drj 



[f (a) + v{L — a)] 



v=o' cl 



with 



ViXo 



drj 



ip{x) 



(4.50) 



(4.51) 



r?=0 



With ()ci we denote the fermionic expectation value in the classical gauge 



field background. For xq 

dpQkix) 



drj 



a, we have 



ri=0 



dV^{x~a6,k) 



dV^{x-a6,k) 
drj 

V{x,k)\ 



h.c. 



r?=0 



Aa?L 



2a? L 



noexp||(C^-C,)} 



— h.c. 



r?=0 



VLq cos ■ 



27r 



The same result is obtained Xq 
3x3 matrix 



T 



a. 



(4.52) 

If we now introduce the diagonal 



Qr, cos ' 



27T 



(4.53) 
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and use {ilj{x)ip{x))c\ = S{x, x), where S{x, x) is the propagator in the gauge 
field background, as discussed in [pTl, we get 



vixo) = -iVf I ^-Tr l^S'^ix, x) aokj • (4.54) 
The results for v{a) and v{L ~ a) are computed for some parameter set in 



7T] and listed in table [4.3| . Since the formula for v{xq) is linear in A'^f, we 
expect the result for dS^/dr] to be of the same value, with the opposite sign. 

We have made runs with the parameter set used in table |4.3| at L/a = 
4, 6. Within the statistical errors, which had a magnitude of about 3%o, our 
numerical simulation results were consistent with these results. 



L/a 


f (a) 


v{L — a) 


4 


0.05454081493 


0.02858695070 


6 


0.01555246822 


0.00892024138 


8 


0.00622327486 


0.00398090210 



Table 4.3: Expectation values v{a), v{L — a) on the classical background field 
for Nf = +2, K = 1/8, 9 = 7r/5 and = 1- 



4.6 Tuning and error propagation 

Computing the step scaling function for a given coupling involves a couple 
of matching runs at size L and a run at size 2L. The error of the coupling at 
2L is easily obtained as the statistical error of the simulation. But also the 
statistical error at L must be taken into account when estimating the error 
of the step scaling function. For the calculation of the step scaling function 
itself, we also have to consider a mismatch. Let us assume that {(3, k) is the 
pair of bare parameters which correspond to the renormalized parameters 
g'^{L) = UQ,mi{L) = 0. Then the simulations yield results 

f{L)\f3* = ui±5ui 
fi2L)\f3. = U2±6u2, (4.55) 

with Ui being slightly different from uq, and (3* having a slight mismatch 
compared to the true /?. We use the expansion ( p. 161) to estimate 

T,{uo,a/L) ^ U2 — {T,{ui,a/L) — T,{uo,a/L)} 

= U2- {ui + squI - Mo - squI H ) 

= M2 - (mi - Mo)(l + 2soMo) + O ((mi - Mo)mo) . (4.56) 
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For the statistical error, we use the estimate 

(5S(no, alL)f ^ {du^f + (1 + 2s^uof{6uif. 



(4.57) 



In addition to the renormahzed couphng, in the tuning procedure the cur- 
rent mass mi(L/a) must be tuned to zero. In order to minimize the amount 
of computer time going into the tuning runs, it is helpful to know which 
precision is needed in the computation of the current mass. In perturbation 
theory one can estimate how a mismatch in the mass propagates into the 
value of the step scaling function |7^. One defines the derivative of S with 
respect to 2; = miL, 



d_ 

dz 



g'^{L)=u, mi{L)L=z J 



2=0 



(4.58) 



By approximating this derivative by its continuum value and using the func- 



tion Ci^i(z) computed in |jT^, one gets 



s;(a/L)^s;(o) 



An dz 



0.00957 Nf. 



(4.59) 



We conclude that tuning the current mass to a residue value 0.001 for example 
on an L/a = 8 lattice leads to an error in the step scaling function smaller 
than 0.0002 M^, which can be neglected compared to our statistical errors. 
Table contains the results of our tuning runs and shows that the measured 
masses are in a safe range. We have also found that the best tuning runs 
for a given pair of g and mi allow for an interpolation to mi = 0. In those 
cases, we have listed the value zero with the error given by the fit routine in 
the table. 



4 . 7 Performance 

For the feasibility of our updating strategy, a reasonable acceptance rate 
in the gauge field update is essential. In particular, there is no tunable 
parameter which could be used to optimize this component of the update. In 
the range of bare couplings we have used in our simulations, the acceptance 
rate does not vary much. It is roughly 76% at = 0.9793, and with 70% 
slightly worse at g"^ = 1.5145. 

In the boson sector of the algorithm, the stopping criterion of the solver 
can be optimized for performance. In the solver implementation we use, 
the iteration scheme is finished as soon as the norm of the residue vector is 
smaller than e times the norm of the solution vector of the preconditioned 
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system. A solution of the system of equations to machine precision amounts 
to setting ^ lO"^"^. 

Since the cost of the whole update process is dominated by the gauge 
field update, we could not obtain a great advantage by increasing e. In test 
runs on a 4^ lattice, the performance maximum is pretty flat in e, and the 
decrease in cost only about 10%. On larger lattices, one expects that the 



accuracy of the solver has to be scaled [67|, so that we expect the benefit of 
tuning this parameter to even diminish on the large lattice. As determining 
autocorrelation times accurately requires large statistics, a detailed perfor- 
mance study on large lattices would be very expensive. Therefore, we have 
dispensed with this and instead set the solver precision to a safe value of 
= 10^^°, which gave an acceptance rate of more than 99%. 
For a comparison of the efficiency of our algorithm with the one for unim- 
proved Wilson bermions and with simulations with dynamical fermions, we 



use the measure Mcost as defined in |7^. It is machine dependent, with our 



reference computer being an 8-node APEIOO, and is defined as 

M^ost = (update time in seconds on machine M) 

X (error of g-^f X (4a/T) {Aa/Lf. (4.60) 

According to this definition, if we want to compute g^'^ with an absolute 
precision of 0.01 on a 4"^ lattice, we need 10000 ■ Mcost seconds update time 
on a Ql machine. The focus on the absolute error of g~^ is motivated by 
its relevance in the recursive scheme used to compute the A parameter. In 
1-loop approximation, the expansion ( p.l6|) can be written as 



-g-\2L) ^ -g-\L) - s,. (4.61) 

This means, under an application of the step scaling function, the error 
of g^^ is conserved. In a recursive computation of the coupling at scales 
L = 2~'^Lmax, the error of g~^ in each step is equally important. 

The cost of our algorithm for the simulated lattice sizes at g"^ = 0.9793 is 



shown in table |4.4| and figure ^]1] in comparison with the algorithm for unim- 
proved bermions. When we fit these data with a linear ansatz, excluding 
the L/a = 4 lattices, we find that the cost both for the improved and unim- 
proved theory scales with about a~^'^. The overhead of improvement however 
turns out to be about a factor 12. Considering that unimproved bermions 
are only a factor 3 more expensive than the quenched approximation, this 
is quite large. In section ^4.8| we will argue that this overhead is nevertheless 
more than compensated by a substantially improved extrapolation to the 
continuum limit. 
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L/a 




Meost 




improved 


unimproved 


4 


0.061(2) 


0.00535(7) 


5 


0.107(3) 


0.00866(13) 


6 


0.212(6) 


0.0155(2) 


8 


0.457(11) 


0.0319(4) 


10 


0.790(17) 




12 


1.30(3) 


0.0788(12) 



Table 4.4: Cost of the measurement of the couphng g"^ = 0.9793 for improved 
and unimproved bermions. The last two entries for improved bermions are 
at f ^ 1.11. 
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Since our motivation for studying the A^f = — 2 model was the fower cost 
in comparison with dynamical fermions, it is also interesting to compare with 
data from We find that on a L/a = 12 lattice, our implementation of im- 
proved bermions is about a factor 10 cheaper than simulations of dynamical 
fermions. Also, our algorithm scales slightly better. 



4.8 Results 

Our simulation program was implemented in the TAO language for the 
APEIOO parallel computer. To give an impression of the cost, the run at 

= ZI(0. 9793, 1/16) needed roughly 20 days on a QH2 machine with 256 
nodes. During this time, 20000 measurements of the couplings have been 
accumulated in each of 16 replica. We have also ported the program to the 
slightly changed APEmille environment. Without taking advantage of many 
improvements like the extended register file, the performance gain is not 
more than a factor 3. About half of the statistics for the measurement of 
(j(1.5145, 1/16) has been collected on an APEmille crate with 128 nodes. 

The measurement of the coupling is quite cheap and its autocorrelation 
time only up to about 6 in units of updates (while other observables like the 
average plaquette are correlated over 10 updates). Since our main interest 
is in the coupling, we have measured it after each update. In contrast, the 
measurement of the correlation functions /a and /p is quite costly. Also, the 
mass mi fiuctuates little and is not required with a great precision. Therefore, 
we have only measured these correlation functions after every 100th or 150th 
update. For the estimation of statistical uncertainties, we have employed the 
methods described in appendix 

Our results for the tuning runs are listed in table The corresponding 



results at the same bare parameters are summarized in table [4.6| . 

By applying the techniques for the propagation of the coupling mismatch 
and error propagation described in section we obtain a/L) for L/a = 



4,5,6,8 as shown in figure L2. In the figure linear in (a/L)"^, all points lie 
on a straight line with no visible appearance of a/L effects. So we fit with 
the ansatz 

S(m, a/L) = a{u) + p{u){a/Lf. (4.62) 

In both cases, the slope p{u) turns out to be compatible with zero within 
the errors. In the computation of the step scaling function in the quenched 
approximation, it has been observed that the Schrodinger functional coupling 
follows perturbation theory to quite low energies. Therefore, we also display 
the 2-loop and 3-loop perturbative values in the diagram, which we obtain 
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L/a 


/3 


K 


9\L) 


mi (L/a) 


4 


10.3488 


0.131024 


0.9793(19) 


0.00000(31) 


5 


10.5617 


0.130797 


0.9795(21) 


0.00055(13) 


6 


10.7302 


0.130686 


0.9793(11) 


0.00000(5) 


8 


11.0026 


0.130489 


0.9793(14) 


0.00000(6) 


4 


8.3378 


0.132959 


1.5145(23) 


0.00000(28) 


5 


8.5453 


0.132637 


1.5145(17) 


0.00000(7) 


6 


8.70830 


0.132433 


1.5145(33) 


0.00000(4) 


8 


8.99 


0.13209 


1.5145(33) 


0.00066(8) 



Table 4.5: Parameters and results for the coupling and the mass at L. 



L/a 




mi{2L/a) 


4 


1.1090(28) 


-0.00300(10) 


5 


1.1079(29) 


-0.00086(5) 


6 


1.1053(30) 


-0.00094(4) 


8 


1.1093(40) 


-0.00025(3) 


4 


1.8734(74) 


-0.00266(12) 


5 


1.8648(82) 


-0.00094(7) 


6 


1.8488(86) 


-0.00070(5) 


8 


1.8741(100) 


0.00002(5) 



Table 4.6: Results for the coupUng and the PCAC mass at 2L at the bare 
parameters given by the listed values of g^{L). 



u a{u) a^-^°°^{u) a^-'°°P(M) 

0.9793 1.1086(72) 1.10435 1.10546 

1.5115 1.8713(167) 1.85122 1.85938 

Table 4.7: Extrapolated simulation results and perturbation theory of the 
massive step scaling function for N{ — —2. 
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Figure 4.2: Step scaling function for improved berniions for the couplings 
u = 0.9793 and u = 1.5145 with fits linear in (a/L)"^. Also shown is the 
extrapolated continuum value and the 2- and 3-loop values. 
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from IP and ( |2.16| ). For a direct comparison, these numbers are summarized 
in table |4.7| . It can be seen that our simulation results are consistent with 
perturbation theory within the errorbars. It can also be seen that the 3-loop 
contribution is of the same order of magnitude as our statistical error (and 
much larger than in the Nf = +2 case). 
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Figure 4.3: Results for the step scaling function at u = 0.9793, together with 
a quadratic fit under the constraint of universality. 



Now we want to compare our data with the results for unimproved Wilson 



bermions from |^. That study covers the step scaling function aiu = 0.9793 
with resolutions L/a = 4,5,6,8,12. Figure ^]3| shows both data sets in a 
common diagram against L/a. While the unimproved data points clearly 
show cutoff effects of a few percent, the improved ones are even consistent 
with a constant within the statistical errors. The diagram shows a joint fit of 
the improved data with a quadratic term and of the unimproved data with 
a linear and quadratic term. In this combined fit, the constraint has been 
imposed that both curves have the same continuum limit. The continuum 
limit from this procedure comes out as cXcombmed (0.9793) = 1.1059(43) which 



76 



is minimally different from the value in table [4.7| . This is a strong indication 
that universality holds. 

An interesting question is whether improvement is profitable. Fitting 
the unimproved data alone with a linear plus quadratic function yields a 
continuum limit of o"unimproved(0.9793) = 1.103(12), i.e. the error estimate 
is by a factor 2.7 higher than for the fit in figure [4.2| . On the other hand, 
the computational cost was only a factor 1.7 less. From these numbers, 
the conclusion is that the implementation of 0(a) improvement is a cost- 
effective way of enhancing the precision of the continuum limit. Surely our 
observations also amend our trust in the extrapolation to the continuum 
limit. 

We should note that this is a quantitative comparison against the Wilson 
action without any improvement. Experience with the Schrodinger func- 
tional boundary conditions in the pure gauge theory indicates that a large 
part of the lattice artefacts is caused by the time-like boundaries, which are 
absent in the usually simulated torus setups. These cutoff effects can be very 
effectively eliminated by setting ct to its perturbative value, with a minimal 
implementation overhead. 

So the more interesting effect is the one by the clover term. For the 
simulation of dynamical fermions, the balance will bend more towards im- 
provement than for bermions. One point to consider is the critical slowing 
down towards the continuum limit. For bermions, a trustworthy extrapo- 
lation of the unimproved data points needed simulations up to L/a = 24, 
whereas we were satisfied with L/a = 16 for the improved case. The Hybrid 
Monte Carlo like algorithms for dynamical fermions scale worse than our 
bermion algorithm, and therefore going to larger lattices is more costly. The 
other point is that our algorithm for improved bermions is very expensive 
compared to the unimproved program. In a Hybrid Monte Carlo program for 
A^f = +2, improvement amounts to the modification of the Dirac operator. 
Hence the inclusion of the clover term implies a much lower overhead, for 
example about 20% as reported in [|7^ . 



Finally, we also want to compare our data with quenched results and 
data for N{ = +2 as published in [jlT| . While we use the 2- loop value for the 
improvement coefficient Ct, it was only known to 1-loop at the time when the 
data for Nf = 0, +2 was produced. 

The cases Nf = and also Nf = —2, as discussed before, look well 
behaved. One can fit both of them linearly in (a/L)'^. When the data point 
L/a = 4 is left out, both data sets are also consistent with a constant, which 
is an indication that higher orders in a do not play a significant role. 

The Nf = +2 case looks more disturbing, and actually this was one of the 
main points that led to the consideration of the bermion model. Simulation 
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Figure 4.4: Step scaling function for N{ = —2, 0, +2 in comparison, in a 
diagram linear in {a/L)^. The 3-loop perturbative values are displayed as 
triangles. 



runs at different couplings have however produced data sets in which the 
L/a = 5,6 points differ less |TT|]. One may conclude that in this figure, the 
point at L/a = 6 has a large statistical fluctuation. Nevertheless, this plot 
shows that one must be careful with using expectations from the improvement 
programme for the extrapolation of data. In particular, simulations closer to 
the continuum limit are evidently needed. 



4.9 Results with perturbative corrections 

The size of remaining cutoff effects in the step scaling function can be com- 
puted in perturbation theory. In order to get an estimate, one expands the 
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relative deviation from the continuum limit in a series 

a/ L) — (j{u) 



6{u, a/L) 



(5io + 5uNi)u + (520 + 52iN( + 522Nf)u^ + 0(m3). (4.63) 



One can hope to smoothen the approach to the continuum limit by cancel- 
ing the perturbative contribution to the remaining cutoff effects from the 
Tj{u,a/L) data from Monte Carlo simulations. Here we also study the cor- 
rected values 

«/^) = 1^,, /n'i^r in ^ (^.64) 

1 + oi[a/L)u + 02[a/L)u^ 



with 5 coefficients computed to 2-loop order Figure ^4.5| shows the data 
for T.^'^\u, a/ L) in comparison with the uncorrected data T.(u,a/L). These 
points are fitted in the same way, i.e. linear in {a/ and without the point 
L/a = 4. The perturbative cutoff effects are quite small and do not change 
the slope of our extrapolations in a systematic way. The L/a = A points 
are even pushed away from the extrapolated value. As the perturbative 
corrections are within the statistical fluctuations of our data, no general 
statement about the success of this approach can be made. However we note 
that the extrapolated value changes only minimally. 



4.10 Lattice artefacts in the current mass 

Another test of the improvement programme and in particular of the PCAC 
relation are the lattice artefacts in the fermionic observables. Our definition 
of the step scaling function is such that the current mass mi(L/a) is tuned to 
zero on the small lattices. We then expect that mi{2L/a) measured with the 
same bare parameters vanishes in the continuum limit and has cutoff effects 
of order O(a^). In order to take into account a small mismatch in mi{L/a), 
we examine the quantity mi{2L / a) — mi{L / a) which is plotted against iajVf' 
in figure The hnear fits in the data points assert the expected scahng 
behavior. Only for = 1.5145 we can see a small deviation which may stem 
from a statistical fluctuation and is not a sign of pathological behavior. 

Also here it is interesting to see this observable in comparison with the 
cases Nf = 0, +2. Figure shows the data points for the coupling g"^ = 



0.9793 in a common diagram. This plot also shows the perturbative data, 
which is available for integer L/a, connected with dotted lines. As mi in 
( p.72| ) is defined differently for even and odd lattice sizes, the perturbative 
points do not lie on a smooth curve. 
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1.9 ■ 



\ 1.88 ■ 

a 




(a/L)^ 

Figure 4.5: Step scaling function for improved bermions for the couplings 
u = 0.9793 and u = 1.5145 with fits linear in (a/L)'^. The rectangles represent 
the data points obtained from our simulations, whereas the circles represent 
the data S^^^ corrected by perturbation theory. 
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Figure 4.6: Lattice artefacts in the PCAC mass mi{2L/a) at = 0.9793 
and g'^ = 1.5145. 
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Apparently the points for A^f = 0, +2 do not follow an O(a^) behavior as 
well. They still behave inoffensively for a —>■ and are similar to perturbative 
expectations. One may infer that higher orders in a provide the dominant 
contributions to the cutoff effects here. 
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Figure 4.7: Lattice artefacts in the PCAC mass at u = 0.9793 for dif- 
ferent flavor numbers. The dashed hnes represent (from bottom to top) 
Nf = —2, 0, +2 perturbative results. Non-perturbative data are represented 
as rectangles, pentagons and circles. 
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Chapter 5 

Decoupling of heavy flavors 



Up to now we have only discussed massless schemes, i.e. schemes in which 
the renormahzation conditions do not depend on the quark masses. This 
approach has the advantage that the renormahzation group equations assume 
a simple form and renormahzation constants do not have a dependency on 
multiple scales. 



For unbroken, non-abelian gauge theories, Appelquist and Carazzone 



have proven a decoupling theorem. It states that heavy fields coupled to the 
massless gauge fields do not contribute to the infrared behavior of the theory, 
apart from renormahzation effects. 

The MS scheme - which is certainly the most popular one for high-loop 
perturbation theory calculations in QCD - is a massless scheme. While this 
approach is in general quite practical, it fails in full QCD with six non- 
degenerate fiavors, when the coupling is needed at different scales in the 
range of the quark masses. For example, the running coupling at an energy 
of 500 MeV obtained by integrating the renormahzation group equation con- 
tains contributions from all six quark flavors, although three of them are so 
heavy that they influence the physics on this scale only in a negligible way. 
Since the /3-function in this case provides an "unphysical" running coupling, 
this has to be compensated by the expansion coefficients of the perturbation 
series. Contributions from the heavy quarks must hence be present in all or- 
ders of this series in the form of logarithms of large quark masses. The reason 
for the lack of decoupling is that the /9-function and the running coupling as 
are not physical observables. Therefore, the decoupling theorem cannot be 
applied directly to them. 

This problem can be circumvented by using different effective theories at 
different scales. At energies far above the top quark mass, one starts with a 
/3-function with six active flavors. When the energy is lowered under the top 
quark mass, one considers an effective /^-function with flve active flavors and 
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a different A parameter. This method is continued at the ffavor thresholds 
at lower energies. In this way, the decoupling is implemented by hand. In 
general, at the matching scales where one switches from one effective theory 
to the other, the running coupling is discontinuous. Consistency of the pairs 
of theories below and above the quark masses implies a set of matching 
conditions for the running coupling and relations between the A parameters. 
They have been systematically worked out up to 2-loop in by connecting 



the theories above and below the threshold to the MOM scheme, where the 



decoupling is explicit. An introduction into this topic can be found in ||TJ 

In our discussion of the Schrodinger functional, we have restricted our- 
selves to a mass-independent step scaling function. Since the up and down 
quarks are in good approximation massless, this is an appropriate approach 
for the low energy regime. At some point one may however want to do 
research on the running coupling with massive quarks included. As in the 
/3-function in the MS scheme, there is no decoupling in the mass-independent 
step scaling function. One could work around this difficulty by using step 
scaling functions with different numbers of flavors in different energy regimes. 
This leaves the problem of connecting the different regimes. 

By using a mass dependent step scaling function, one can move through 
the flavor thresholds. For example, above the charm quark threshold one 
simulates an action with A^f = 4 quarks. When the scale is made smaller than 
the charm quark mass, its influence on the step scaling function gradually 
vanishes. At some scale the effect of the charm quark becomes negligible in 
relation to the precision aimed at, and one can switch to an action with only 
Af = 3 simulated quarks. 

Unfortunately, heavy quarks on the lattice can cause sizable lattice arte- 
facts and make extrapolations to the continuum limit impossible since one 
is restricted to lattice sizes that can be simulated on the computer. With 
the method described above, one still has to be able to simulate the theory 
with the heavy quark included down to the matching point, where the change 
to the theory without the heavy quark can be made. The question then is 
whether one can find a mass cutoff at which the heavy quark has already 
decoupled, and at the same time lattice artefacts are still small. 

Figure ^?T] illustrates how the /5-function changes at 1-loop level when the 
massQ is increased. The dashed line shows the behavior in the MOM scheme 
[]79| . As chiral symmetry holds, the /3-function is an even function of such 
that the asymptotic behavior is proportional to for small masses and 1/ z'^ 
for large masses. Therefore, the decoupling is quite abrupt. 



^In the Schrodinger functional scheme parametrized by z = m^L, in the MOM scheme 
by z = m^/iJ.. 
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Figure 5.1: Contribution to the 1-loop coefEcient of the (3-function per Qavor. 
The solid curve shows the behavior in the Schrddinger functional scheme for 
9 = 7r/5, whereas the dotted curve shows the MOM scheme. 



In the Schrodinger functional scheme, as represented by the sohd curve, 
the transition region is broader^. The temporal fermionic boundary condi- 
tions here violate chiral symmetry even in the continuum limit, hence terms 
odd in z are possible. Nevertheless, in it is argued on the ground of 



a 1-loop estimation that when a quark with z > Zcut = 2 is omitted from 
the /3-function, an error in (^"^ of only 0.003 is made, which is well below 
both the typical statistical errors in the ALPHA programme and experimen- 
tal measurements. A computation of the lattice artefacts to 1-loop indicates 
that lattice artefacts even for several flavors up to 2; = 2 are quite small 
and follow (a/L)^ expectations when a Symanzik improved action is used. 
Putting these observations together, one can conclude that it is feasible to 
simulate the range of masses up to Zcut = 2 in the massive theory and switch 



^The precise shape of the curve depends e.g. on the parameter 6. Therefore, the "dip" 
in the right half of the plot should not be overrated. 
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to the theory with the heavy quark omitted for higher z. 

In this chapter, we are going to corroborate whether this statement holds 
non-perturbatively. As in the previous chapter, we use the bermion model 
as a relatively cheap testing ground. By computing the step scaling function 
for increasing z, we test the transition from the massive A^f = —2 theory to 
the massless iVf = theory. 



5.1 Theory 



We now introduce some of the concepts used in this chapter. We mostly 



keep the notation used in . Results from there are quoted without further 
notice. 

As in the massless case, the scale dependence of the coupling is given by 
the Callan-Symanzik /3-function, 



(5.1) 



where the derivative is understood to be taken with the renormalized pa- 
rameters giLo) and m(Lo) fixed at some scale Lq. The coefficients in the 
expansion of the /3-function 



are dependent on the mass parameter 

z = m{L)L. 



(5.2) 



(5.3) 



The step scaling function is generalized in a similar way. On the smaller 
lattice, the bare parameters are tuned such that the renormalized coupling 
and the parameter z take their prescribed values. The step scaling function 
then is the value of the renormalized coupling on a lattice with twice the size 
and the same bare parameters. 



cr M, z] 



(5.4) 



g{L)=u,m{L)=z / L 

In a recursive computation of the running coupling, the quark mass runs, and 
therefore in each iteration step the parameter z has to be adjusted depending 
on the scale. This can be done by complementing the step scaling function 
for the coupling with a step scaling function cTp for the mass, generalizing the 
definition in [BOl, 



(Jp[U,Z) 



Zp{go,2L/a) 



lim 

a^o Zp{go,L/a) 



(5.5) 



g{L)=u,m{L)=z/ L 



86 



In order to find the mass dependence of quantities in perturbation theory, 
it is convenient to relate their coefficients to a massless scheme like MS. In 
the expansion 



9h = f+"-^t + 0{g% (5.6) 



the coefficient Ci {z) has a pure gauge theory contribution and a mass-dependent 
contribution proportional to the number of flavors, 

ci(-2) = ci,o + iVfCi,i(2;). (5.7) 

For large the quark contribution diverges as 

ci,i(^) = ^log(^) + 0(lA), (5.8) 

with corrections depending on the phase 9 in the fcrmionic boundary con- 
ditions. Denoting g"^ = g'^{L)\z=o, this logarithmic divergence shows up for 
example in the mass dependence of the coupling at fixed scale, 

f-g' = -^{c,,,{z)-c,,M}9' + 0{g') 

-^,log{z)g' + 0{g% (5.9) 

The massive step scaling function in 1-loop can be computed by switch- 
ing the coupling to the MS scheme at the scale L and going back to the 
Schrodinger functional scheme at the scale 2L and the mass parameter 2z. 
We obtain 

a{u, z) = u + so{z)u'^ + 0{u^) (5.10) 



with 



So{z) = ^{ci,i(z) -ci,i(2z)} + 21og26o(0). (5.11) 



For large z, the mass dependent contribution to the step scaling function 
converges to 

a{u, z) - a{u, 0) ^ log2 + 0{u'), (5.12) 
which is just the difference between the theories with A^f and zero quarks. 
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5.2 Renormalized mass 



The connection between the renormahzed and the PCAC mass is given by 
the relation ( ^.7^ ). For our matching runs, we have always approximated 
z m\L. In the following, we will discuss the neglected quantities appearing 
there. 

In perturbation theory, the improvement coefficient 6a ~ is |^ 



(6a - 6p)(^7o) = -0.00093(8)(7o' + ^{gt)- (5-13) 

In the quenched approximation, this value was computed non-perturbatively. 
It turns out that only for /3 -C 7 one obtains values that are significantly 
larger in magnitude. As this improvement coefficient is multiplied with the 
subtracted mass mq, the overall effect on the mass is very tiny compared to 
our statistical error. 

The renormalization factor Za of the axial vector current is scale inde- 
pendent. To 1-loop in perturbation theory, it is given by [^] 

Za((7o) = \^zl\l^O{gt) 

Z^^ = -0.116458(2). (5.14) 

In the quenched approximation, the non-perturbative determination of 
deviates about 2 % from this perturbative formula at the bare couplings used 
here. 

The axial density renormalization factor Zp can also be expanded in the 
bare coupling. The coefficients are functions oi L/a and diverge as powers 
of log(L/a) in the continuum limit WM, 



Zp{go,L/a) = 1 + zi.'\L/a)gl + 0{g'o) 
Z?{L/a) = ^zp{e,p) - dolog{L/a) + 0{a/L). (5.15) 



Here we have followed the notation of where the calculation was done 
for some choices of 6 and p = T/L, and without background field. 

By neglecting the renormalization factors we make a systematic error in 
z of 

Sz/z = 6m/m = (Z« - Z«) gl + 0(^7o'). (5.16) 

Numerically, the scale independent part of Zp^ cancels Z^^ almost exactly. 
The logarithmic term causes Sz/z to vary between roughly 0.051 on the 
L/a = 4 lattices and 0.071 on the L/a = 8 lattices, for the bare couplings 
we have used. On the one hand, this leads to a simultaneous shift of all data 
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points for a fixed z. On the other hand, it leads to different errors in the 
step scahng function with fixed z but different Lja. The latter error can 
be estimated as follows. When we neglect the mass dependence of lattice 
artefacts, an error in z propagates into the step scaling function as 

bo{u,z)^ bzs'^{z)u^. (5.17) 

The derivative of s^i^z) with respect to z can be computed numerically using 



data from [|T^. The largest effect appears for 2; = 2, where 

So(2) = 0.00349 ATf. (5.18) 

The resulting error in o"(m, 2;) then is less than 0.0008, which is an order 
of magnitude below our statistical error. Although the use of several as- 
sumptions from perturbation theory with vanishing background field in this 
estimate is quite unsatisfactory, we believe it to be not too unrealistic. 



5.3 Results 



In table |5Tl| , we list our results of the step scaling function for the lattice 
sizes L/a = 4, 5, 6, 8 and for values of the mass z = 0.5, 1, 2. In addition to 
these results, we use for 2; = the results at g'^ = 1.5145 from chapter ^ as 
listed in table |46| . Figure |5]^ shows a plot with all results. 

Although the error bars are roughly of the same size as the variation of 
the step scaling function for different L/a, when z is kept constant, one sees 
a slight trend that lattice artefacts become larger for increasing mass. Since 
for L/a = 4, the value 2; = 2 is already equivalent with ma = 1/2, this is 
to be expected. Nevertheless, the lattice artefacts do not pose a problem of 
principle for the extrapolation of our data with an ansatz linear in (a/L)^. 

We have fitted our data with the L/a = 4 point dropped. This extrapo- 



lation is shown in figure ^.2| . To see how much the L/a = A point influences 
the extrapolation, we have also computed the fit with this point, and listed 
the results in table |5.2| . All these numbers are also to be compared with the 
perturbative values 

<T=^-i°°P(1.5145,0)|jVf=-2 = 1.859385 
o-^~'°°P(1.5145,0)|7Vf=o = 1.806922. (5.19) 

In the figure, these 3-loop results are shown as red and black triangles, 
respectively. From these numbers, it is clear that our accuracy is not sufficient 
for resolving the step scaling functions with 2; = 0.5, 1,2. In fact, when we 
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Zl 


T 1 

L/a 






—2fT\ 

9 (L) 




0.5 


4 


8.274636 


0.129184 


1.5145(15) 


1.8225(53) 


0.5 


5 


8.474702 


0.129606 


1.5145(16) 


1.8393(69) 


0.5 


a 
D 


8.D51298 


0.129873 


1.5145(24) 


1.8270(82) 


0.5 


8 


8.920407 


0.130185 


1.5145(35) 


1.8320(111) 


1 


4 


8.227619 


0.125497 


1.5145(13) 


1.8095(52) 


1 


5 


8.426126 


0.126580 


1.5145(16) 


1.8158(65) 


1 


6 


8.599670 


0.127310 


1.5145(30) 


1.8152(91) 


1 


8 


8.855909 


0.128237 


1.5145(41) 


1.8311(99) 


2 


4 


8.166523 


0.118802 


1.5145(13) 


1.7975(49) 


2 


5 


8.357652 


0.120885 


1.5145(17) 


1.8096(87) 


2 


6 


8.521765 


0.122377 


1.5145(20) 


1.8246(88) 


2 


8 


8.793347 


0.124347 


1.5145(33) 


1.8342(97) 



Table 5.1: Tuning results for (5 and k for the given values of z and L/a, 
together with the measured coupling at L and 2L. 



z 


without L/a — A 


with L/a — A 





1.871(17) 


1.859(11) 


0.5 


1.820(18) 


1.840(11) 


1.0 


1.831(19) 


1.833(11) 


2.0 


1.851(18) 


1.844(11) 



Table 5.2: Extrapolated simulation results of the massive step scaling func- 
tion (7(1.5145, z) for N{ = -2. 







n= 1 


n = 2 


n — 3 


-n- 


-'°°P(1.5145,0)|Arf=-2 

-i°°P(1.5145,0)|7Vf=o 


1.811555 

1.773940 


1.851225 

1.803887 


1.859385 

1.806922 




-^°°P(1.5145,0)|Ar,=-2 
-'°°P(1.5145, 0)1^^=0 


1.762845 
1.735997 


1.828158 
1.788089 


1.849306 
1.801806 



Table 5.3: Perturbation theory for the step scaling function for — —2 and 
Nf — to 1-loop, 2-loop and 3-loop. 
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0.02 0.04 
(a/L)^ 



0.06 



Figure 5.2: Step scaling function at u = 1.5145 for z = 0,0.5, 1,2 with fits 
linear in [a/L)"^. Please note that this figure is meant to be viewed in color. 



leave out the L/a = 4 point, the extrapolated values appear in the "wrong" 
order, but with a statistical error that is roughly of the same magnitude as 
the difference between the values. 

In [0, the lattice artefacts parametrized as in ([4.63|) were calculated to 
1-loop for the massive step scaling function, both for an improved and unim- 
proved action. The mass definition used there is the pole mass, which is to 
1-loop equivalent to our mass definition with the background field switched 
off. At least for z = 0, the lattice artefacts do not differ much between differ- 
ent mass definitions, so that the qualitative behavior is probably independent 



of it 17^ 



The lattice artefacts we see in our simulations are of the same magnitude 
as those computed perturbatively. The perturbative lattice artefacts also 
behave like (a/L)^, which is consistent with our observation that an ansatz 
linear in (a/L)^ works. It would be desirable to have the 6i{a/L) values or 
even S2{a/L) for z 7^ with our mass definition. Then one could use the 
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method described in section [4.9| to cancel the perturbative contributions to 
the lattice artefacts. Since the lattice artefacts cause the step scaling function 
at finite a/L to decouple faster than in the continuum, this method should 
reduce the slope of the fit, which here depends on the mass parameter. 

We would also like to note here that the size of lattice artefacts depends 
on the number of flavors. With A^f = —2 used here, the quark contribution 
has the same sign as the pure gauge theory contribution. This situation 
changes for iVf = 2 and A^f = 4. There, the perturbative lattice artefacts, for 
instance with z = 2, are smaller than in the pure gauge theory. 

A quantitative comparison of our extrapolated data with perturbation 



theory is difficult. In table 5.3, the perturbative results for the step scaling 



in 1-loop to 3-loop are listed, according to the definitions ( 2.18 ) and ( |2.19 ) 



Clearly, the 2-loop and higher contributions are so significant that a com- 
parison of non-perturbative results to the 1-loop results is impossible. On the 
other hand, the massive /5-function has not been computed beyond 1-loop. 
An alternative to studying absolute values of a{u, z) might be to consider 
only the mass dependent contribution a{u,z) — a{u^Q). But also for this 
quantity, higher order terms problem. We expect 

a{u,z)\Ni=-2 - cr{u, 0)|Arj=_2 cr{u,0)\Nt=o - a{u,0)\Nt=-2- (5.20) 
But to 1-loop, 

ai~'°°P(1.5145,0)|w,=o -'7^"^°°P(1.5145,0)|Arj=_2 = -0.038 
a^-^°°P(1.5145,0)|7Vf=o-fx^"'°°^(1.5145,0)|iVt=-2 = -0.027. (5.21) 

This indicates that O(m^) corrections to this difference are of the same order 
of magnitude as the ^-dependence we want to investigate. 
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Chapter 6 

Summary and Outlook 



In this thesis, we have investigated aspects of the step scahng function in the 
Schrodinger functional scheme. 

In chapter ^ we have compared the efficiency of various algorithms in pure 
SU(3) gauge theory. The emphasis in this study is on the coupling g, which 
is directly relevant for the computation of the step scaling function. The 
favored method for this kind of theories is the Hybrid Overrelaxation (HOR) 
algorithm. However, this algorithm is not easily generalizable to theories 
with an action that is quadratic in the individual link variables. The other 
algorithms in this study are based on molecular djTiamics. While the global 
Hybrid Monte Carlo (HMC) is the standard algorithm for the simulation 
of dynamical fermions, the Local Hybrid Monte Carlo (LHMC) is a local 
version of it. The important results of this study are a factor of 3 in the 
cost of LHMC, and even a factor of 26 for HMC compared to the HOR 
algorithm, for the lattice size L/a = 8. These numbers may for instance 
serve as a guideline for the choice of suitable algorithms for actions with a 
pure gauge theory like term plus a correction. The bermion model studied 
in this work is an example for such a theory. 

Chapter H is devoted to the step scaling function for two flavors of massless 
bermions with 0(a) improvement. The motivation for considering this model 
is the reduced computational cost in comparison with dynamical fermions. 
We have studied in detail the approach to the continuum limit and lattice 
artefacts. We have found that lattice artefacts follow the expected behavior 
and an extrapolation linearly in (a/L)^ works well and yields results compat- 
ible with renormalized perturbation theory. A comparison with unimproved 
Wilson bermions illustrates the success of 0(a) Symanzik improvement and 
confirms that the unimproved and improved theory converge to a univer- 
sal continuum limit. Unfortunately, algorithmic difficulties arising from the 
inclusion of the clover term cause a large overhead of our simulation pro- 
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gram compared to the unimproved one, so that the efficiency advantage over 
fermions is only a factor 10. Consequently, we cannot reach lattice sizes 
much larger than with dynamical fermions. Nevertheless, our results put the 
methods used in the Nf = 2 case on a ffimer ground and increase our trust 
that a solid determination of the A parameter for two massless fermions can 
soon be finished. The running coupling for A^f = 2 will also serve as the basis 
for quark mass renormalization and phenomenological investigations. 

In chapter ^ we have investigated the massive step scaling function in the 
bermion model. Since heavy quarks are expected to decouple at low energies, 
this quantity can be used to move through a flavor threshold, below which 
one can switch to an effective theory without the heavy quark. Our aim 
was to corroborate statements from previous perturbative studies, which are 
essential for the feasibility of the recursive evaluation of the running coupling 
in QCD with several massive quarks. We have found lattice artefacts to be 
of tractable magnitude in our range of simulated masses. The extrapolation 
of the massive step scaling function showed that the decoupling effect is not 
totally different from our expectation. Resolving different values of the mass 
correctly would require to increase our statistical precision significantly in 
comparison to the mass dependence of the step scaling function. 
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Appendix A 
Notation 



As a convention, we use letters . . . from the middle of the greek alphabet 
to denote Lorentz indices running from to 3. When we are referring to 
spatial components only, we use latin letters k,l, . . .. Dirac spinors carry 
indices A, B, . . .. Color vectors in the fundamental representation of SU(A^) 
have indices a, j3 from the beginning of the greek alphabet. Repeated indices 
are always summed over. 

We distinguish space-hke 3- vectors x = {x-i,X2,x^) from 4- vectors x — 
(xo, Xi, ^2, x^) by a bold font. 

SU(2) matrices 

Any 2x2 matrix can be written in the form 

w^wo + iwkak- (A.l) 
where are the Pauli matrices 

The set of complex coefficients are called the quaternionic representation. In 
this representation, 

det w = w^w^ 

Trw = 2wo. (A.3) 

For SU(2) matrices w, these coefficients are real and fulfill WfjWf^ — 1. 

Another possible parametrization of SU(2) is the exponential mapping 
with group generators. This will not be used here, but only in the SU(N)case 
where iV > 3. 
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SU(A^) matrices 



Every S\J{N) matrix U can be parametrized by 

[/ = exp < iTaUJa \ ■ 



(A.4) 



a=l 



The coefficients LOa are real- valued. Ta G su(A'") are called generators of the 
Lie algebra associated with the group. Their representation is a,s N x N 
matrices. They fulfill 

= -X and TrX = 0. (A.5) 
For SU(3), we write 7} = Xj, where Xj are the Gell-Mann matrices 
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/ 1/^3 
l/^/3 
V -2/x/3 






-1 |(A.6) 



(A.7) 



(A.8) 



Haar measure 

As integration measure for SU(iV), we use the Haar measure dU. It has the 
(for a gauge theory natural) property of being invariant under left and right 
multiplication with group elements, 

J dUfiU) = J dUfigUh-'), g,he SU{N) (A.9) 

and it is normalized, 

J dU =1. (A.IO) 
In the quaternionic representation of SU(2), the Haar measure takes the form 



dw 



1 



27r2 



S{w^ - l)d'^w. 



(A.ll) 
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Dirac matrices 

We choose Dirac matrices in the chiral representation, where 

( -i(Ji,2,3 \ 



7l,2,3 



«(7l,2,3 



7. = ( ; ) . (A.13) 

These matrices fulfill 

7i = 7/., 7^=1, {7m, 7.} = 2(5^,.- (A.14) 

Also, in this representation, 75 = 70717273 has the property 75 = 75 and 
7I = 1. The matrices 

(^^^u^^b^^,lA (A.15) 
are hermitian and are explicitly given by 

' CTfc \ 



Cofe — 



-ak 



(^i3 = ^ijk ( 'Jj' j , (A.16) 



where e^-fc is the totally antisymmetric tensor. 



Lattice derivatives 

On the lattice, derivatives have to be discretized. We define forward and 
backward derivatives by 

= ^W-^^ (A.17) 

(X 

where fi denotes the unit vector in direction ji. 
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Appendix B 
Random numbers 



Generator 

All updating algorithms used in this work require a large number of random 
numbers. In practice, it is only feasible to use pseudo random numbers, which 
are produced by deterministic process. Here we use the generator by Liischer 
which is based on one proposed by Marsaglia and Zaman It is 



known to have a large periodicity time and its underlying dynamical system 
is chaotic, so it is assumed to generate numbers which do not introduce a 
bias into simulation results. 



Gaussian distribution 

2 

Random numbers with a distribution proportional to can be generated 
in pairs. This means, we generate the combined distribution 

P{y,,y2)=expi-yl-yl), (B.l) 

which gives two independent random numbers with the desired distribu- 
tion. The variables yi and y2 can be interpreted as coordinates in a two- 
dimensional plane, i.e. they can be transformed into polar coordinates 

yi = p cos 6 

y2 = psine. (B.2) 
The transformed variables have a distribution P{p, 6) with 



P(yi,y2)dyidy2 = P{p,9)dpd9 
d{p, 9) 



det%# = p. (B.3) 
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The distribution for the polar coordinates is thus given by 

oc ^{l-e-p"). (B.4) 

6 can simply be computed by multiplying a uniformly in [0,1[ distributed 
number by 2tt. p can be obtained by drawing u from a uniform distribution 
in [0, 1[ and computing^] 

p=(-log(l-M))i (B.5) 



^The usage of 1 — u avoids logarithms of zero, since u takes values smaller than 1. 
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Appendix C 
Error analysis 



An important aspect of the extraction of results from Monte Carlo runs is 
the estimation of errors. As the set of data from a finite simulation run is a 
sample from a statistical ensemble, there is of course a naive statistical error 



Furthermore, any Monte Carlo run needs some time before it reaches the 
thermal equilibrium, where field configurations are distributed according to 
the ensemble given by the Boltzmann factor. Including the thermalization 
phase into the average of an observable leads to an initialization bias. The 
only practical way to avoid this is to throw away a sufficiently high number 
of measurements at the beginning of the run. 

Another source of errors is the correlation between successive measure- 
ments. In general, an updating algorithm does not generate a chain of config- 
urations which are really independent of each other. One can express the real 
statistical error of an observable O as the product of the "naive error" and 
a factor 2rint, where Tint is an integrated autocorrelation time as described 
below. The autocorrelation time may differ between different observables. 

In the evaluation of autocorrelation times, it is essential to distinguish 
between primary quantities and secondary quantities. Primary quantities, 
such as the inverse coupling or the average plaquette, are directly extracted 
from a field configuration, i.e. they are expectation values of functions of the 
field variables. Secondary quantities are functions of two or more, in general 
correlated, primary quantities. In this work, an example for a secondary 
quantity is the current mass which is calculated as a quotient of fermionic 
correlation functions /a and /p. 



proportional to 
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Primary quantities 



Let us a sequence of measurements a* , z = 1 . . . iV of an observable A which 
has the exact mean value A = (A). We assume that these stem from a large 
enough ensemble in the thermal equilibrium, such that they have a Gaussian 
distribution following the central limit theorem. The natural estimator for 
A is the sample average, 

1 ^ . 

A^a:=—ya\ (C.l) 

This is the best estimator in the sense that the expectation value of d over 
an infinite ensemble of Monte Carlo runs is the correct value, 

(a - A)mc = 0. (C.2) 

If all configurations in the sample are uncorrelated, the variance of the 
normal distribution is 

^2 ^ {{A-Af) 



N 

{{d-Af)uc 



N 



(C.3) 



In the realistic case, where configurations are correlated, this underestimates 
the error. We introduce the autocorrelation function 



r{z-j) = {{a^-A){a^-A))^^ (C.4) 



between the i'th and the j'th measurement. F depends only on the distance 
between the measurements. Its value at zero is 

r(o) = ((i-A)^). (C.5) 

An estimator for F is 

1 ^ 

i=t+i 

In general, we neglect the error of r(t). The integrated autocorrelation time 
is defined a^l 

^This definition is one common convention, where Tint becomes 1/2 when there are no 
autocorrelations. 
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We can now express the "true" variance of the estimator a by the auto- 
correlation function, 



a' 



1 N N 

i=lj=l 

1 Af-1 

= iN-\t\)m 

t=~{N~l) 

~ r(0)^ forAr>t. (C.8) 

A comparison with ( |C.3| ) shows that the error behaves as if the effective 
number of measurements was N/{2T[^t) instead of N. Intuitively speaking, 
generating a completely decorrelated configuration from a given one requires 
2rint update steps. 

A popular method called binning takes this statement verbatim: in the 
binning method, one averages measurements over bins of length A'^bin and 
looks at the naively computed error of the binned data set. When A'bin 
is increased, subsequent bins should gradually become uncorrelated after it 
reaches the value 2rint. One expects the error in dependence of the bin 
length to converge to a plateau. In practice, the autocorrelation function 
has statistical errors which are enhanced for increasing Abin (which implies a 
decreasing number of bins) and the plateau is washed out. The determination 
of Tint and in particular its error is therefore difficult. 

Here, we use a different method explained in The main point here 
is not to sum up F over the whole possible range of times, but only over a 
window [-M,M], 

^--2£,f(oy ^^-'^ 

Making this window larger increases the variance of the estimated Tint, mak- 



ing it smaller increases its bias. As a compromise, Madras and Sokal 
propose to choose the window to be the smallest one which fulfills the condi- 
tion M > cfint, with a suitably chosen c. As fint depends on the choice of M, 
this plays the role of a self-consistency condition. We usually choose c = 6. 
With this method, the error of Tint can be estimated as 

Tint V A^ ^ ^ 
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Secondary quantities 



Let us consider a set of primary quantities Aa with exact mean values Aa. 
For each of them, there are measurement data a^,i = 1 . . .N, which have 
sample means a^. Now we want to study a secondary quantity which is given 
as a function / = /({A^}). A natural estimator for the value F = f{{Aa}) is 
/({oq,}). For primary quantities, we have seen in (|C.2|) that the expectation 
value of the estimator is identical with the exact mean and the variance goes 
with 1 /N, 

((a,-A„)2)Mc = 0(l/iV). (C.ll) 

For secondary quantities, this is a bit different. To see this, we follow [ fTSf 
and expand /({fla}) around the set of Aa. We denote the derivative of / 
with respects to its a'th argument as fa- Then the Taylor expansion to first 
order reads 

/({a J) =F + Y^ifla - Aa)fa{{Aa}) + O ('^^(a, - Aaf\ . (C.12) 
Thus, the estimate has an unavoidable bias 



MC 



/({aJ)-F)^^ = 0(l/iV). (C.13) 
This can be reduced by making large enough. The variance is 

= ((/({«.})- ^')^>M0 

This can be written in a more intuitive form by defining projected observables 

Ah = Ah = ^A„/„({a„}) 

" " (C.15) 

Oh = ^^aafaiiAa}) = X! ({^" }) ' 

a a 

such that 

a' = ((Ah - aH)n_ + 0(l/iV^) 



MC 

\2\ 



{A^-a^Y)^^. (C.16) 
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This is formally the same situation as in the case of primary observables. 
Consequently, the same methods can be applied here. In particular, given the 
analytical form of the derivatives of / with respect to all its arguments, we can 
compute the projected data a^^ and use them to determine the autocorrelation 
function T{t) according to ( P.6|) . This allows to compute the autocorrelation 
time Tint and the error cr^. 

As is argued in , the advantage of this "summation method" compared 
to jackknife is that the error of Tint decays much faster with the number of 
data points. Therefore, error estimates can be be accurate, which is par- 
ticularly important for performance studies, where Tint directly enters the 
definition of the cost of an algorithm. 

In this thesis, this aspect is less important. In the algorithms used here, 
the Schrodinger functional coupling can be measured as a primary quantity 
(in contrast to e.g. the PHMC algorithm, where the coupling is obtained by 
a reweighting procedure). Nevertheless, we have used the method described 



here e.g. for the determination of Csw in section [4.3| . The variable AM 



used there is computed from 12 primary observables, which means that 12 
derivatives have to be implemented for the computation of the projected 
observables. This shows the increased complexity compared to the more 
"universal" jackknife method. 
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